Datomic – A Modern Database - StampedeCon 2014
StampedeCon
At StampedeCon 2014, Alex Miller (Cognitect) presented "Datomic – A Modern Database."
Datomic is a distributed database designed to run on next-generation cloud architectures. Datomic stores facts and retractions using a flexible schema, consistent transactions, and a logic-based query language. The focus on facts over time gives you the ability to look at the state of the database at any point in time and traverse your transactional data in many ways.
We’ll take a tour of the Datomic data model, transactions, query language, and architecture to highlight some of the unique attributes of Datomic and why it is an ideal modern database.
Service Oriented Architecture - Unit II - Modeling databases in xml
Roselin Mary S
Modeling databases in xml
Steps:
1. Review the database schema.
2. Construct the desired XML document.
3. Define a schema for the XML document.
4. Create the JAXB binding schema.
5. Generate the JAXB classes based on the schema.
6. Develop a Data Access Object (DAO).
7. Develop a servlet for HTTP access.
MongoDB for Coder Training (Coding Serbia 2013)
Uwe Printz
Slides of my MongoDB Training given at Coding Serbia Conference on 18.10.2013
Agenda:
1. Introduction to NoSQL & MongoDB
2. Data manipulation: Learn how to CRUD with MongoDB
3. Indexing: Speed up your queries with MongoDB
4. MapReduce: Data aggregation with MongoDB
5. Aggregation Framework: Data aggregation done the MongoDB way
6. Replication: High Availability with MongoDB
7. Sharding: Scaling with MongoDB
Modeling JSON data for NoSQL document databases
Ryan CrawCour
Modeling data in a relational database is easy; we all know how to do it because that's what we've always been taught. But what about NoSQL document databases?
Document databases take much of what you know and flip it upside down. This talk covers some common patterns for modeling data and how to approach things when working with document stores such as Azure DocumentDB.
FITC presents: Mobile & offline data synchronization in Angular JS
FITC
OVERVIEW
Are you building mobile or web applications with AngularJS and wish they would work when you were offline? You can read, send and delete mail from your mobile email client when you are offline, why not from your AngularJS app? AngularJS is completely agnostic when it comes to creating your data models. Let’s explore what is required to allow your application to be useful to your users even without an internet connection.
INTENDED AUDIENCE - BEGINNER - INTERMEDIATE
This presentation is for developers that know they are looking for offline and data synchronization capabilities. Or, possibly for managers that wish to have a greater understanding of what their options are in AngularJS to create such functionality.
Daniel Zen, CEO, Zen Digital
Daniel Zen is the CEO of Zen Digital, founder of the New York AngularJS Meetup, a frequent lecturer, and a former consultant for Google, Pivotal Labs and various Fortune 500 companies. Zen Digital uses Agile techniques to move projects forward while continuously integrating new code and ideas, producing elegant frontend experiences and efficient backend systems for web and mobile applications.
High level look at RavenDB features presented as a 10 minute lightning talk at the Nov 19 2013 BTVWag.org meeting of 8 lightning talks on NoSQL databases.
Connect 2016 - Move Your XPages Applications to the Fast Lane
Howard Greenberg
Are your XPages applications performing like a Florida senior citizen driving in the left lane at 55 mph? A key to speeding up your XPages applications is knowledge of the JSF lifecycle, partial refresh and partial execution. This session will cover these concepts and then apply them to optimizing an XPages application. Learn how to use tools to measure the performance of your XPages and determine where the bottlenecks are. Several sample applications will be analyzed along with alternative programming choices to improve their performance. Learn how to dramatically increase your XPages performance and make your users happy - you might even get a speeding ticket after this session!
[PASS Summit 2016] Azure DocumentDB: A Deep Dive into Advanced Features
Andrew Liu
Let's talk about how you can get the most out of Azure DocumentDB. In this session we will dive deep into the mechanics of DocumentDB and explain the various levers available to tune performance and scale. From partitioned collections to global databases to advanced indexing and query features - this session will equip you with the best practices and nuggets of information that will become invaluable tools in your toolbox for building blazingly fast large-scale applications.
Enterprise PHP Architecture through Design Patterns and Modularization (Midwe...
Aaron Saray
Object Oriented Programming in enterprise level PHP is incredibly important. In this presentation, concepts like MVC architecture, data mappers, services, and domain and data models will be discussed. Simple demonstrations will be used to show patterns and best practices. In addition, using tools like Doctrine or integration with Salesforce or the AS/400 will also be discussed. There will be an emphasis on the practical application of these techniques as well - this isn't just a theoretical talk! This presentation is great for those just beginning to create enterprise applications as well as those who have had years of experience.
This advanced session will cover topics on how to leverage both CFML and ORM to start creating amazing applications that will be as lethal as a dinosaur riding a shark with an Uzi. We will cover ORM session management, virtual service layers, dynamic finders, dynamic counters and an enhanced Hibernate Criteria builder to create easy and programmatic HQL queries.
Talk that Rodrigo Flores and I presented at TDC 2015 in the NoSQL track. In this talk we give a basic introduction to the Datomic database and explain why it is a great fit for our needs at Nubank.
Learn about Hitchhiker Trees from David Greenberg, a new functional, immutable, persistent variation of a fractal tree. In these slides, we'll learn how to understand immutable data structures and a variety of trees, introducing new concepts as we build up to the hitchhiker tree.
Python Data Wrangling: Preparing for the Future
Wes McKinney
Given at PyCon HK on October 29, 2016. About open source work in progress to advance the Python pandas project internals and leverage synergies with other efforts in OSS data technology
Improving Python and Spark (PySpark) Performance and Interoperability
Wes McKinney
Slides from Spark Summit East 2017 — February 9, 2017 in Boston. Discusses ongoing development work to accelerate Python-on-Spark performance using Apache Arrow and other tools
Mesos: The Operating System for your Datacenter
David Greenberg
Maybe you’ve heard of Mesos—that thing that you can run Hadoop on. I think it powers Twitter? Isn’t it an Apache project, or something?
In this talk, we’ll learn all about Mesos—what it is, how you can leverage it to simplify your infrastructure and reduce AWS/cloud computing costs, and why you should develop your next application on top of it. This talk will give you the tools you need to understand whether Mesos is the right fit for your infrastructure, and several starting points for learning more about Mesos.
Data scientists often have a different background and priorities than software engineers. A lot of the code Data Scientists write never makes it to production, and as a result, the code might not always meet the same standards as production-ready code in a developer team. While it makes sense to have rather lax requirements on code for one-off analyses, this can lead to difficulties in maintaining production code and collaborating on projects with software engineers. Since production code is not (always) the main output of a data science team, it can also be hard to prioritize code quality.
In this presentation, we will go over some of the main principles of clean code and talk about practical steps that data science teams can take to improve their code. We will specifically focus on strategies that teams can implement to slowly and steadily improve the existing code base. This talk is aimed at data scientists who may not have a strong background in software engineering, but are interested in improving code quality and collaborating more effectively with software engineering teams.
Introducing Apache Spark's Data Frames and Dataset APIs workshop series
Holden Karau
This session of the workshop introduces Spark SQL along with DataFrames and Datasets. Datasets give us the ability to easily intermix relational and functional style programming. So that we can explore the new Dataset API, this iteration will be focused on Scala.
Introduction to Spark Datasets - Functional and relational together at last
Holden Karau
Spark Datasets are an evolution of Spark DataFrames which allow us to work with both functional and relational transformations on big data with the speed of Spark.
Title: Glorp Tutorial
Speaker: Niall Ross
Mon, August 18, 2:00pm – 3:30pm
Video Part1: https://www.youtube.com/watch?v=cPN1A4WQyiA
Video Part2: https://www.youtube.com/watch?v=25S6cSYgh34
Abstract:
The target audience for this hands-on tutorial is those with little or no Glorp experience (but more experienced people willing to pair-program with beginners are most welcome). The tutorial will help them to start using Glorp in their own applications.
Participants will create a simple Glorp descriptor system for a domain model. They will generate a database from it, incorporating some existing legacy. They will write and read between the database and their domain model using Glorp commands. The issues of transactions, caching and refreshing will be addressed.
Swift offers a compelling opportunity to build reliable, scalable and lightweight micro-services. This talk will discuss how I am investigating the use of Swift micro-services - running in ECS and potentially Lambda, fronted by API gateway - with AWS technologies such as managing data models with DynamoDb.
Introduction to InfluxDB, an Open Source Distributed Time Series Database by ...
Hakka Labs
In this presentation, Paul introduces InfluxDB, a distributed time series database that he open sourced based on the backend infrastructure at Errplane. He talks about why you'd want a database specifically for time series and he covers the API and some of the key features of InfluxDB, including:
• Stores metrics (like Graphite) and events (like page views, exceptions, deploys)
• No external dependencies (self contained binary)
• Fast. Handles many thousands of writes per second on a single node
• HTTP API for reading and writing data
• SQL-like query language
• Distributed to scale out to many machines
• Built in aggregate and statistics functions
• Built in downsampling
Beyond Wordcount with spark datasets (and scalaing) - Nide PDX Jan 2018
Holden Karau
Apache Spark is one of the most popular big data systems, but once the shiny finish starts to wear off you can find yourself wondering if you've accidentally deployed a Ford Pinto into production. This talk will look at the challenges that come with scaling Spark jobs. Also, the talk will explore Spark's new(ish) Dataset/DataFrame API, as well as how it’s evolving in Spark 2.3 with improved Python support.
If you're already a Spark user, come to find out why it’s not all your fault. If you aren't already a Spark user, come to find out how to save yourself from some of the pitfalls once you move beyond the example code.
Check out Holden's newest book, High Performance Spark, for more information!
From https://niketechtalksjan2018.splashthat.com/
Dart is a new language for the web, enabling you to write JavaScript in a secure and manageable way. No need to worry about "JavaScript: The bad parts".
This presentation concentrates on the developer experience converting from the Java based GWT to Dart.
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
Your Digital Assistant.
Making complex approach simple. Straightforward process saves time. No more waiting to connect with people that matter to you. Safety first is not a cliché - Securely protect information in cloud storage to prevent any third party from accessing data.
Would you rather make your visitors feel burdened by making them wait? Or choose VizMan for a stress-free experience? VizMan is an automated visitor management system that works for any industry, including but not limited to factories, societies, government institutes, and warehouses. It is a new-age contactless way of logging information about visitors, employees, packages, and vehicles. VizMan is a digital logbook, so it avoids unnecessary use of paper or space: there is no need for bundles of registers left to collect dust in a corner of a room. It records visitors' essential details, helps in scheduling meetings for visitors and employees, and assists in supervising the attendance of the employees. With VizMan, visitors don’t need to wait for hours in long queues. VizMan handles visitors with the value they deserve because we know time is important to you.
Feasible Features
One Subscription, Four Modules – Admin, Employee, Receptionist, and Gatekeeper ensures confidentiality and prevents data from being manipulated
User Friendly – can be easily used on Android, iOS, and Web Interface
Multiple Accessibility – Log in through any device from any place at any time
One app for all industries – a Visitor Management System that works for any organisation.
Stress-free Sign-up
Visitor is registered and checked-in by the Receptionist
Host gets a notification, where they opt to Approve the meeting
Host notifies the Receptionist of the end of the meeting
Visitor is checked-out by the Receptionist
Host enters notes and remarks of the meeting
Customizable Components
Scheduling Meetings – Host can invite visitors for meetings and also approve, reject and reschedule meetings
Single/Bulk invites – Invitations can be sent individually to a visitor or collectively to many visitors
VIP Visitors – Additional security of data for VIP visitors to avoid misuse of information
Courier Management – Keeps a check on deliveries like commodities being delivered in and out of establishments
Alerts & Notifications – Get notified on SMS, email, and application
Parking Management – Manage availability of parking space
Individual log-in – Every user has their own log-in id
Visitor/Meeting Analytics – Evaluate notes and remarks of the meeting stored in the system
Visitor Management System is a secure and user friendly database manager that records, filters, tracks the visitors to your organization.
"Secure Your Premises with VizMan (VMS) – Get It Now"
Large Language Models and the End of Programming
Matt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
Modern design is crucial in today's digital environment, and this is especially true for SharePoint intranets. The design of these digital hubs is critical to user engagement and productivity enhancement. They are the cornerstone of internal collaboration and interaction within enterprises.
A Comprehensive Look at Generative AI in Retail App Testing.pdf
kalichargn70th171
Traditional software testing methods are being challenged in retail, where customer expectations and technological advancements continually shape the landscape. Enter generative AI—a transformative subset of artificial intelligence technologies poised to revolutionize software testing.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.
In software engineering, the right architecture is essential for robust, scalable platforms. Wix has undergone a pivotal shift from event sourcing to a CRUD-based model for its microservices. This talk will chart the course of this pivotal journey.
Event sourcing, which records state changes as immutable events, provided robust auditing and "time travel" debugging for Wix Stores' microservices. Despite its benefits, the complexity it introduced in state management slowed development. Wix responded by adopting a simpler, unified CRUD model. This talk will explore the challenges of event sourcing and the advantages of Wix's new "CRUD on steroids" approach, which streamlines API integration and domain event management while preserving data integrity and system resilience.
Participants will gain valuable insights into Wix's strategies for ensuring atomicity in database updates and event production, as well as caching, materialization, and performance optimization techniques within a distributed system.
Join us to discover how Wix has mastered the art of balancing simplicity and extensibility, and learn how the re-adoption of the modest CRUD has turbocharged their development velocity, resilience, and scalability in a high-growth environment.
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work, the team is investigating ways to speed up the time to solution for many different parts of the DIII-D workflow, including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks, and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
How Recreation Management Software Can Streamline Your Operations.pptx
wottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
takuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
Strategies for Successful Data Migration Tools.pptx
varshanayak241
Data migration is a complex but essential task for organizations aiming to modernize their IT infrastructure and leverage new technologies. By understanding common challenges and implementing these strategies, businesses can achieve a successful migration with minimal disruption. Data Migration Tool like Ask On Data play a pivotal role in this journey, offering features that streamline the process, ensure data integrity, and maintain security. With the right approach and tools, organizations can turn the challenge of data migration into an opportunity for growth and innovation.
Designing for Privacy in Amazon Web Services
KrzysztofKkol1
Data privacy is one of the most critical issues that businesses face. This presentation shares insights on the principles and best practices for ensuring the resilience and security of your workload.
Drawing on a real-life project from the HR industry, the various challenges will be demonstrated: data protection, self-healing, business continuity, security, and transparency of data processing. This systematized approach made it possible to create a secure AWS cloud infrastructure that not only met strict compliance rules but also exceeded the client's expectations.
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
Understanding Globus Data Transfers with NetSage
Globus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
4. Main features
● No update-in-place
● Only assertions and retractions
● Retraction != removal
● Immutable data
● Database as a value
● Flexible schema
● ACID transactions
● Declarative and logic data programming
9. Time
● By-product of ordering states within an entity
● Linear progression
● Morning => ElfenLied
● Afternoon => John
● Evening => ElfenLied
10. Datom
● Basic unit of operation in Datomic
● Also called a fact
● Made up of five parts: entity, attribute, value, transaction, operation
11. Datom example
Entity   Attribute   Value       Tx        Operation
2439     name        John        at work   add
2439     name        John        playing   retract
2439     name        ElfenLied   playing   add
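The example above can be sketched as a toy model in Python. This is not the Datomic API: the `Datom` tuple, the integer transaction ids (1 for "at work", 2 for "playing") and the `current_values` helper are all hypothetical names chosen for illustration. The sketch shows that a retraction is a new fact appended to the log, not a deletion.

```python
from collections import namedtuple

# Hypothetical stand-in for a datom: entity, attribute, value, tx, operation.
Datom = namedtuple("Datom", ["e", "a", "v", "tx", "added"])

# The three facts from the slide. Tx ids are made up: 1 = "at work", 2 = "playing".
log = [
    Datom(2439, "name", "John", 1, True),        # add
    Datom(2439, "name", "John", 2, False),       # retract
    Datom(2439, "name", "ElfenLied", 2, True),   # add
]

def current_values(datoms, entity, attribute):
    """Replay assertions and retractions in transaction order."""
    values = set()
    for d in sorted(datoms, key=lambda d: d.tx):  # stable sort keeps log order within a tx
        if d.e == entity and d.a == attribute:
            if d.added:
                values.add(d.v)
            else:
                values.discard(d.v)
    return values

print(current_values(log, 2439, "name"))
print(len(log))  # the retracted fact is still part of the log
```

Replaying the log yields the current name, while all three facts remain available for inspection.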
12. Not only “now”
● Easy access to past and future states
● What was the value of x one week ago?
● What was the state of the DB before the app crashed?
● Not so easy in an RDBMS
13. Everything is data
● Datomic operates on data
● Lists, maps, vectors, keywords, strings
● Even functions and schema are data
● Possible to run queries on all data
14. Transactor
● Single instance
● Handles all writes
● Second instance on stand-by
● Ensures ACID properties
● Notifies the transaction submitter and all peers (other applications) when a transaction is persisted
15. Storage
● Datomic does not implement its own storage layer
● Leverages storage services:
– DynamoDB
– Riak
– Infinispan
– SQL databases
● Stores segments of datoms, not individual datoms
● Log storage – chronological
● Indices storage – various orders of datoms
16. Indices
● Entity / Attribute / Value / Transaction
● Sorted order
● EAVT – all datoms (SQL “row-like” view)
● AEVT – all datoms (SQL “column-like” view)
● AVET – datoms of indexed or unique attributes
● VAET – datoms of reference attributes
17. EAVT
● EAVT – all datoms (SQL “row-like” view)
134   name     Tim    4592   add
586   city     32     1975   add
586   gender   male   4592   add
586   name     John   4592   add
976   name     Rob    4938   add
18. AEVT
● AEVT – all datoms (SQL “column-like” view)
city    986   32         1975   add
name    134   Tim        4592   add
name    576   Rob        4938   add
name    986   John       4592   add
title   367   Lord Jim   4592   add
19. AVET
● AVET – datoms of indexed or unique attributes
city    32         986   1975   add
name    John       986   4592   add
name    Rob        576   4938   add
name    Tim        134   4592   add
title   Lord Jim   367   4592   add
20. VAET
● VAET – reference attributes
15     author   841   1975   add
32     city     134   4592   add
269    city     576   4938   add
517    city     986   4592   add
male   gender   986   4592   add
21. Value of DB
● View of db is a value
● Immutable
● Graph of entities and their attributes
● Direct iterable access to indices
● DB value is a param to queries and functions
22. Datalog
● Default query language
● Declarative – WHAT not HOW
● Logic – pattern matching
● Logical variables
23. Query structure
● :find – projection clause, similar to SELECT
● :in – binding arguments
– Implicit for some queries
● :where – restriction clause
● Most of the logic would be in :where clause
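To make the :find and :where roles concrete, here is a minimal pattern matcher in Python. It is a sketch of the idea only, not Datomic's query engine or API: `q`, `match` and `is_var` are hypothetical helpers, datoms are reduced to (entity, attribute, value) triples, and the :in clause is left out by passing the database directly.

```python
def is_var(term):
    """Logic variables are written as strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")

def match(pattern, datom, binding):
    """Extend `binding` so that `pattern` matches `datom`; return None on failure."""
    binding = dict(binding)
    for term, value in zip(pattern, datom):
        if is_var(term):
            if term in binding and binding[term] != value:
                return None  # variable already bound to a different value
            binding[term] = value
        elif term != value:
            return None      # constant in the pattern does not match
    return binding

def q(find, where, db):
    """Join all :where patterns, then project the :find variables."""
    bindings = [{}]
    for pattern in where:
        bindings = [b2 for b in bindings for d in db
                    for b2 in [match(pattern, d, b)] if b2 is not None]
    return {tuple(b[v] for v in find) for b in bindings}

db = [
    (134, "name", "Tim"),
    (586, "name", "John"),
    (586, "city", 32),
    (976, "name", "Rob"),
]

# "Find the names of entities whose city is entity 32."
result = q(find=["?name"],
           where=[("?e", "city", 32), ("?e", "name", "?name")],
           db=db)
print(result)
```

The shared variable `?e` is what joins the two patterns; the query states what to find and leaves the matching strategy to the engine.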
43. DB function components
● Function is a data structure
● Declared with :db/fn
● Optional docs
● Clojure or Java
● Parameters
● Require (Clojure) or import (Java) block
● Your code
49. Database filters
● Filter DB value based on some predicate
● Keep only relevant datoms
● Built-in filters and custom filters
● Filters let one set of queries operate on different DB values
50. As-of
● Returns the DB value “as of” a particular point in time
● Ignores any transactions after that point
● Point-in-time could be:
– Transaction id
– java.util.Date instance
– Time-basis (t) of database
● What did the DB look like last week, last month, etc.?
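An as-of view can be pictured as a filter over the full list of datoms. The Python sketch below is a toy model under stated assumptions: `Datom`, `history` and `as_of` are hypothetical names, transaction ids are small integers, and only the transaction-id form of a point in time is shown (not the java.util.Date form).

```python
from collections import namedtuple

# Hypothetical datom model; tx ids 1 and 2 are made up.
Datom = namedtuple("Datom", ["e", "a", "v", "tx", "added"])

history = [
    Datom(2439, "name", "John", 1, True),
    Datom(2439, "name", "John", 2, False),
    Datom(2439, "name", "ElfenLied", 2, True),
]

def as_of(datoms, t):
    """Keep datoms from transactions up to and including t; ignore later ones."""
    return [d for d in datoms if d.tx <= t]

past = as_of(history, 1)
print([d.v for d in past])
print(len(history))  # the full history is unchanged by taking the view
```

Because the view is only a filter, the same query code can run against `past` or against the full `history`.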
52. Since
● Opposite of as-of
● Returns a value of the database that includes only datoms added after a certain point in time
● What were the transactions after point X in time?
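The since filter can be sketched the same way, as the complement of an as-of view. Again this is a toy Python model, not the Datomic API: `Datom`, `history` and `since` are hypothetical names and the transaction ids are invented.

```python
from collections import namedtuple

# Hypothetical datom model; tx ids 1 to 3 are made up.
Datom = namedtuple("Datom", ["e", "a", "v", "tx", "added"])

history = [
    Datom(2439, "name", "John", 1, True),
    Datom(2439, "name", "John", 2, False),
    Datom(2439, "name", "ElfenLied", 2, True),
    Datom(2439, "city", 32, 3, True),
]

def since(datoms, t):
    """Keep only datoms from transactions strictly after t."""
    return [d for d in datoms if d.tx > t]

recent = since(history, 1)
print(sorted({d.tx for d in recent}))  # which transactions happened after tx 1
```

Note that a since view may contain a retraction whose original assertion lies before the cut-off, as with the retraction of "John" here.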
62. ● Still more from Datomic:
– Negation in query
– Retraction
– Excision – true removal
– Partitioning
– Transactions
– with – state with proposed additions
● What would happen if we did x?
– More in depth on covered topics
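The "with" item in the list above can be illustrated with an immutable value in Python. This is a sketch of the idea, not Datomic's API: the database is modelled as a plain tuple of (entity, attribute, value) facts, and `with_` and `names` are hypothetical helpers.

```python
def with_(db, proposed):
    """Return a new db value that includes the proposed facts; `db` is untouched."""
    return db + tuple(proposed)

def names(db):
    """Collect every value asserted for the `name` attribute."""
    return {v for (e, a, v) in db if a == "name"}

db = (
    (134, "name", "Tim"),
    (586, "name", "John"),
)

# "What would happen if we added Rob?" Nothing is written anywhere.
speculative = with_(db, [(976, "name", "Rob")])

print(names(speculative))
print(names(db))  # the original value does not see the proposed fact
```

Since both values coexist, the outcome of a proposed change can be queried and compared with the current state before anything is submitted to the transactor.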