Hadoop World 2011: Advanced HBase Schema Design - Lars George - Cloudera, Inc.
"While running a simple key/value based solution on HBase usually requires an equally simple schema, it is less trivial to operate a different application that has to insert thousands of records per second.
This talk will address the architectural challenges when designing for either read or write performance imposed by HBase. It will include examples of real world use-cases and how they can be implemented on top of HBase, using schemas that optimize for the given access patterns. "
Polyglot Persistence - Two Great Tastes That Taste Great Together - John Wood
The days of the relational database being a one-stop-shop for all of your persistence needs are over. Although NoSQL databases address some issues that can’t be addressed by relational databases, the opposite is true as well. The relational database offers an unparalleled feature set and rock-solid stability. One cannot overstate the importance of using the right tool for the job, and for some jobs, one tool is not enough. This talk focuses on the strengths and weaknesses of both relational and NoSQL databases, the benefits and challenges of polyglot persistence, and examples of polyglot persistence in the wild.
These slides were presented at WindyCityDB 2010.
This talk delves into the many ways a user can employ HBase in a project. Lars will look at practical examples based on real applications in production, for example at Facebook and eBay, and the right approach for those wanting to find their own implementation. He will also discuss advanced concepts such as counters, coprocessors, and schema design.
HBase Advanced Schema Design - Berlin Buzzwords - June 2012 - larsgeorge
While running a simple key/value based solution on HBase usually requires an equally simple schema, it is less trivial to operate a different application that has to insert thousands of records per second. This talk will address the architectural challenges when designing for either read or write performance imposed by HBase. It will include examples of real world use-cases and how they
http://berlinbuzzwords.de/sessions/advanced-hbase-schema-design
You can watch the replay for this Geek Sync webcast, Successfully Migrating Existing Databases to Azure SQL Database, on the IDERA Resource Center, http://ow.ly/k4p050A4rBA.
First impressions have long-lasting effects. When dealing with an architecture change like migrating to Azure SQL Database, the last thing you want to do is leave a bad first impression by having an unsuccessful migration. In this session, you will learn the difference between Azure SQL Database, SQL Managed Instances, and Elastic Pools, and how to use tools to test migrations for compatibility issues before you start the migration process. You will learn how to successfully migrate your database schema and data to the cloud. Finally, you will learn how to determine which performance tier is a good starting point for your existing workload(s) and how to monitor your workload over time to make sure your users have a great experience while you save as much money as possible.
Speaker: John Sterrett is an MCSE: Data Platform, Principal Consultant and the Founder of Procure SQL LLC. John has presented at many community events, including Microsoft Ignite, PASS Member Summit, SQLRally, 24 Hours of PASS, SQLSaturdays, PASS Chapters, and Virtual Chapter meetings. John is a leader of the Austin SQL Server User Group and the founder of the HADR Virtual Chapter.
Hadoop World 2011: Advanced HBase Schema Design - Cloudera, Inc.
While running a simple key/value based solution on HBase usually requires an equally simple schema, it is less trivial to operate a different application that has to insert thousands of records per second.
This talk will address the architectural challenges when designing for either read or write performance imposed by HBase. It will include examples of real world use-cases and how they can be implemented on top of HBase, using schemas that optimize for the given access patterns.
Geek Sync | Designing Data Intensive Cloud Native Applications - IDERA Software
You can watch the replay for this Geek Sync webcast, Designing Data Intensive Cloud Native Applications, in the AquaFold Resource Center in the next week: http://ow.ly/gZ0g50A4rvR.
Cloud is rapidly changing the way modern-day applications are being designed. Data is at the center of multiple challenges while architecting solutions in the cloud.
With technology changing rapidly, there are new possibilities for processing data efficiently. Cloud Native is a combination of various patterns like DevOps, CI/CD, Containers, Orchestration, Microservices, and Cloud Infrastructure. In this session, you will learn more about the tools and technologies that will help you to design data-intensive systems. We will take a structured approach towards architecting data-centric solutions, covering technologies like message queues, data partitioning, search index, data cache, event sourcing, NoSQL solutions, microservices, and cloud migration strategies.
Join Samir Behara as he discusses the high-level design principles that will help you build scalable, resilient, and maintainable systems in the cloud.
Speaker: Samir Behara is a Solution Architect with EBSCO Industries and builds cloud native solutions using cutting-edge technologies. He is a Microsoft Data Platform MVP with over 12 years of IT experience working on large-scale applications involving complex business functionalities, web integration, and data management. Samir is a frequent speaker at technical conferences and is the Co-Chapter Lead of the Steel City SQL Server User Group. He is the author of www.dotnetvibes.com.
HBase Applications - Atlanta HUG - May 2014 - larsgeorge
HBase is good at various workloads, ranging from sequential range scans to purely random access. These access patterns can be translated into application types, usually falling into two major groups: entities and events. This presentation discusses the underlying implications and how to approach those use cases. Examples taken from Facebook show how this has been tackled in real life.
Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo... - Cloudera, Inc.
Facebook has one of the largest Apache Hadoop data warehouses in the world, primarily queried through Apache Hive for offline data processing and analytics. However, the need for realtime analytics and end-user access has led to the development of several new systems built using Apache HBase. This talk will cover specific use cases and the work done at Facebook around building large scale, low latency and high throughput realtime services with Hadoop and HBase. This includes several significant contributions to existing projects as well as the release of new open source projects.
Relational databases vs Non-relational databases - James Serra
There is a lot of confusion about the place and purpose of the many recent non-relational database solutions ("NoSQL databases") compared to the relational database solutions that have been around for so many years. In this presentation I will first clarify what exactly these database solutions are, compare them, and discuss the best use cases for each. I'll discuss topics involving OLTP, scaling, data warehousing, polyglot persistence, and the CAP theorem. We will even touch on a new type of database solution called NewSQL. If you are building a new solution it is important to understand all your options so you take the right path to success.
End-to-end Data Governance with Apache Avro and Atlas - DataWorks Summit
Aeolus is Comcast’s new internal Big Data system for providing access to an integrated view of a wide variety of high-quality, near-real-time and batch data. Such integration can enable data scientists to uncover otherwise hidden trends, anomalies, and powerful predictors of business successes and failures. But integrating data across silos in a large enterprise is fraught with peril. There typically are few standards on naming conventions and data representation, and spotty documentation at best. The old rule of thumb often applies: 70% of the analysts’ time goes into data wrangling, while only 30% goes toward the actual analyses and simulations. The goal of the Athene Data Governance Platform within Aeolus is to invert this ratio. This talk will explain how Comcast is using Apache Avro and Atlas for end-to-end data governance, the challenges faced, and methods used to address these challenges.
Avro provides a lingua franca for data representation, data integration, and schema evolution. All data published for community consumption must have an associated avro schema in Atlas. Every step in its journey through Aeolus, in flight or at rest, is captured in Atlas. Atlas’ extensibility has allowed us to add or update various entity types (e.g., avro schemas, kafka topics, object store pseudo-directories) and lineage types (e.g., storing streaming data in object storage; embellishing and re-publishing streaming data; performing aggregations and other transformations on data at rest; and evolution of schemas with compatibility flags). Transformation services notify Atlas of lineage links via custom asynchronous kafka messaging.
Atlas provides self-service data discovery and lineage browsing and querying, via full-text search, DSL query language, or gremlin graph query language. Example queries: “Where is data from kafka topic X stored?” “Display the journey of data currently stored in pseudo-directory X since it entered the Aeolus system”. “Show me all earlier versions of schema S, and whether they are forward/backward compatible with each other.”
Apachecon Europe 2012: Operating HBase - Things you need to know - Christian Gügi
If you’re running HBase in production, you have to be aware of many things. In this talk we will share our experience in running and operating an HBase production cluster for a customer. To avoid common pitfalls, we’ll discuss problems and challenges we’ve faced as well as practical solutions (real-world techniques) for repair.
Even though HBase provides internal tools for diagnosing issues and for repair, running a healthy cluster can still be challenging for an administrator. We'll cover some background on these tools as well as on HBase internals such as compaction, region splits and their distribution.
We'll also introduce our recently open-sourced tool to visualize region sizing and distribution in the cluster.
Apache HBase™ is the Hadoop database, a distributed, scalable, big data store. It is a column-oriented database management system that runs on top of HDFS.
Apache HBase is an open source NoSQL database that provides real-time read/write access to large data sets. HBase is natively integrated with Hadoop and works seamlessly alongside other data access engines through YARN.
PelletServer wraps a range of semantic technologies -- query, reasoning, machine learning, planning, and constraint solving -- in a RESTful interface and sensible set of defaults & conventions. Even the wily shell programmer can build semweb apps with wget!
Shard-Query, an MPP database for the cloud using the LAMP stack - Justin Swanhart
This combined #SFMySQL and #SFPHP meetup talked about Shard-Query. You can find the video to accompany this set of slides here: https://www.youtube.com/watch?v=vC3mL_5DfEM
A machine learning and data science pipeline for real companies - DataWorks Summit
Comcast is one of the largest cable and telecommunications providers in the country built on decades of mergers, acquisitions, and subscriber growth. The success of our company depends on keeping our customers happy and how quickly we can pivot with changing trends and new technologies. Data abounds within our internal data centers and edge networks as well as both the private and public cloud across multiple vendors.
Within such an environment and given such challenges, how do we get AI, machine learning, and data science platforms built so our company can respond to the market, predict our customers’ needs and create new revenue generating products that delight our customers? If you don’t happen to be our friends and colleagues at Google, Facebook, and Amazon, what are technologies, strategies, and toolkits you can employ to bring together disparate data sets and quickly get them into the hands of your data scientists and then into your own production systems for use by your customers and business partners?
We’ll explore our journey and evolution and look at specific technologies and decisions that have gotten us to where we are today and demo how our platform works.
Speaker
Ray Harrison, Comcast, Enterprise Architect
Prashant Khanolkar, Comcast, Principal Architect Big Data
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn - LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn. This was a presentation made at QCon 2009 and is embedded on LinkedIn's blog - http://blog.linkedin.com/
Architect’s Open-Source Guide for a Data Mesh Architecture - Databricks
Data Mesh is an innovative concept addressing many data challenges from an architectural, cultural, and organizational perspective. But is the world ready to implement Data Mesh?
In this session, we will review the importance of core Data Mesh principles, what they can offer, and when it is a good idea to try a Data Mesh architecture. We will discuss common challenges with implementation of Data Mesh systems and focus on the role of open-source projects for it. Projects like Apache Spark can play a key part in standardized infrastructure platform implementation of Data Mesh. We will examine the landscape of useful data engineering open-source projects to utilize in several areas of a Data Mesh system in practice, along with an architectural example. We will touch on what work (culture, tools, mindset) needs to be done to ensure Data Mesh is more accessible for engineers in the industry.
The audience will leave with a good understanding of the benefits of Data Mesh architecture, common challenges, and the role of Apache Spark and other open-source projects for its implementation in real systems.
This session is targeted for architects, decision-makers, data-engineers, and system designers.
Version Your Cloud: Using Perforce to Manage Your Object Storage - Perforce
Jeppesen uses a combination of their own data centers and a variety of 3rd party cloud providers to support their distribution of navigation data to most of the world’s pilots. But before any data leaves the building it goes through Perforce. Learn why Jeppesen chose to put Perforce at the heart of their global data distribution system, and how you too could leverage Perforce to help provide atomic changesets and true history tracking to your cloud-scale object stores.
I work in a Data Innovation Lab with a horde of Data Scientists. Data Scientists gather data, clean data, apply Machine Learning algorithms and produce results, all of that with specialized tools (Dataiku, Scikit-Learn, R...). These processes run on a single machine, on data that is fixed in time, and they have no constraint on execution speed.
With my fellow Developers, our goal is to bring these processes to production. Our constraints are very different: we want the code to be versioned, to be tested, to be deployed automatically and to produce logs. We also need it to run in production on distributed architectures (Spark, Hadoop), with fixed versions of languages and frameworks (Scala...), and with data that changes every day.
In this talk, I will explain how we, Developers, work hand-in-hand with Data Scientists to shorten the path to running data workflows in production.
Webinar: MongoDB Management Service (MMS): Session 02 - Backing up Data - MongoDB
In this session we will cover how MMS can be used to back up and restore your data from MongoDB. We will cover backing up both replica sets and sharded environments, and discuss how to do point-in-time restores and how to restore from snapshots.
Presented by Sam Weaver:
Sam Weaver is a Senior Solution Architect at MongoDB based in London. Prior to MongoDB, he worked at Red Hat doing technical pre-sales on the product portfolio including Linux, Virtualisation and Middleware. Originally from Cheltenham, England he received his Bachelors from Cardiff University and currently lives in Camberley, Surrey.
Visit www.mongodb.com/presentations to view other presentations in this series and many other topics.
The Hadoop Ecosystem for developers session in DevGeekWeek in Israel.
This was a day-long session about big data problems and the Hadoop solution. We also talked about Spark and NoSQL.
What is Data Warehousing?
Who needs Data Warehousing?
Why is a Data Warehouse required?
Types of Systems
OLTP
OLAP
Maintenance of Data Warehouse
Data Warehousing Life Cycle
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart... - Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient... - Mind IT Systems
Healthcare providers often struggle with the complexities of chronic conditions and remote patient monitoring, as each patient requires personalized care and ongoing monitoring. Off-the-shelf solutions may not meet these diverse needs, leading to inefficiencies and gaps in care. This is where custom healthcare software offers a tailored solution, ensuring improved care and effectiveness.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ... - Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc. I didn't get rich from it, but it did have 63K downloads (powering possibly tens of thousands of websites).
Essentials of Automations: The Art of Triggers and Actions in FME - Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Utilocate offers a comprehensive solution for locate ticket management by automating and streamlining the entire process. By integrating with Geospatial Information Systems (GIS), it provides accurate mapping and visualization of utility locations, enhancing decision-making and reducing the risk of errors. The system's advanced data analytics tools help identify trends, predict potential issues, and optimize resource allocation, making the locate ticket management process smarter and more efficient. Additionally, automated ticket management ensures consistency and reduces human error, while real-time notifications keep all relevant personnel informed and ready to respond promptly.
The system's ability to streamline workflows and automate ticket routing significantly reduces the time taken to process each ticket, making the process faster and more efficient. Mobile access allows field technicians to update ticket information on the go, ensuring that the latest information is always available and accelerating the locate process. Overall, Utilocate not only enhances the efficiency and accuracy of locate ticket management but also improves safety by minimizing the risk of utility damage through precise and timely locates.
Quarkus Hidden and Forbidden Extensions - Max Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite - Google
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
👉👉 Click Here To Get More Info 👇👇
https://sumonreview.com/ai-pilot-review/
AI Pilot Review: Key Features
✅Deploy AI expert bots in Any Niche With Just A Click
✅With one keyword, generate complete funnels, websites, landing pages, and more.
✅More than 85 AI features are included in the AI pilot.
✅No setup or configuration; use your voice (like Siri) to do whatever you want.
✅You Can Use AI Pilot To Create your version of AI Pilot And Charge People For It…
✅ZERO Manual Work With AI Pilot. Never write, Design, Or Code Again.
✅ZERO Limits On Features Or Usages
✅Use Our AI-powered Traffic To Get Hundreds Of Customers
✅No Complicated Setup: Get Up And Running In 2 Minutes
✅99.99% Up-Time Guaranteed
✅30 Days Money-Back Guarantee
✅ZERO Upfront Cost
See My Other Reviews Article:
(1) TubeTrivia AI Review: https://sumonreview.com/tubetrivia-ai-review
(2) SocioWave Review: https://sumonreview.com/sociowave-review
(3) AI Partner & Profit Review: https://sumonreview.com/ai-partner-profit-review
(4) AI Ebook Suite Review: https://sumonreview.com/ai-ebook-suite-review
Providing Globus Services to Users of JASMIN for Environmental Data Analysis - Globus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
Top 7 Unique WhatsApp API Benefits | Saudi Arabia - Yara Milbes
Discover the transformative power of the WhatsApp API in our latest SlideShare presentation, "Top 7 Unique WhatsApp API Benefits." In today's fast-paced digital era, effective communication is crucial for both personal and professional success. Whether you're a small business looking to enhance customer interactions or an individual seeking seamless communication with loved ones, the WhatsApp API offers robust capabilities that can significantly elevate your experience.
In this presentation, we delve into the top 7 distinctive benefits of the WhatsApp API, provided by the leading WhatsApp API service provider in Saudi Arabia. Learn how to streamline customer support, automate notifications, leverage rich media messaging, run scalable marketing campaigns, integrate secure payments, synchronize with CRM systems, and ensure enhanced security and privacy.
Enterprise Resource Planning System includes various modules that reduce any business's workload. Additionally, it organizes workflows, which drives enhanced productivity. Here is a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus... - Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv... - Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
Enhancing Research Orchestration Capabilities at ORNL.pdf - Globus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam - takuyayamamoto1800
In this slide deck, we show a simulation example and how to compile the solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
5. • The time required to develop games is increasing
6. • The time required to develop games is increasing
• The manpower and computational cost of developing games is increasing
7. • The time required to develop games is increasing
• The manpower and computational cost of developing games is increasing
• The unit price of a game has remained relatively stable
10. • Player expectations for graphical fidelity are increasing
• Player expectations for the scale of the user experience are increasing
11. • Player expectations for graphical fidelity are increasing
• Player expectations for the scale of the user experience are increasing
• Developer expectations for our tools to meet this demand are increasing
12. • This problem has manifested itself in the form of impacted iteration times:
– For Artists
– For Designers
– For Programmers
– For Production
– For Operations
13. • This problem has manifested itself in the form of impacted iteration times:
• Questions:
– How long until a content author sees the result of changing ____ ?
– How long until a content author can commit a change to ____ ?
14. • This problem has manifested itself in the form of impacted iteration times:
• Questions:
• Answers:
– Desired? Instantly
– Reality? Possibly minutes to hours
17. The game content pipeline
• Purpose:
– Transform unmeaningful data into meaningful data
18. The game content pipeline
• Purpose:
– Transform unmeaningful data into meaningful data
– Often, that means source data into production data
19. The game content pipeline
• Purpose:
– Transform unmeaningful data into meaningful data
– Often, that means source data into production data
– Production data is content that ends up in a player’s hands
20. The game content pipeline
• Purpose:
• Game content:
– Meshes
– Textures
– Animations
– Shaders
– Materials
– Physics
– Scripts
– Narrative
– Audio
21. The game content pipeline
• Purpose:
• Game content:
– Heterogeneous:
• Has different intrinsic characteristics (formats, disk-space impact)
• Requires different applied transformations
22. The game content pipeline
• Purpose:
• Game content:
– Heterogeneous:
• Has different intrinsic characteristics (formats, disk-space impact)
• Requires different applied transformations
– Hierarchical relationships
• Dependencies
• Content identity
23. The game content pipeline
• Philosophy:
– Create a unified content pipeline system and architecture at our studio
24. The game content pipeline
• Philosophy:
– Create a unified content pipeline system and architecture at our studio
– Properties:
• Correct
• Expressive
• Explicit
• Homogenous
• Scalable
• Debuggable
25. This is a lecture on how we think and reason about
achieving these desired properties
34. Data transformation functions
• Purpose:
• Example:
• Invariants:
– Serializable
• Both inputs and outputs are representable in different contexts
– Memory mapped
– File system
– Networked
35. Data transformation functions
• Purpose:
• Example:
• Invariants:
– Serializable
– Transferrable
• Usable within a game’s pipeline or run-time
36. Data transformation functions
• Purpose:
• Example:
• Invariants:
– Serializable
– Transferrable
– Side-effect free
• Safely usable in a concurrent system
• Knowable state and always debuggable
37. Data transformation functions
• Purpose:
• Example:
• Invariants:
– Serializable
– Transferrable
– Side-effect free
– Discrete
• Conceptually simple to reason about
• Represents a single unit of transformation
38. Data transformation functions
• Purpose:
• Example:
• Invariants:
– Serializable
– Transferrable
– Side-effect free
– Discrete
– Atomic
• Transformation either did or did not occur, there is no half-way
39. Data transformation functions
• Goals:
– Develop a library of transformations
– Define an appropriate granularity
– Feverously respect invariants
40. Data transformation functions
• Goals:
– Develop a library of transformations
• Needed to represent the full scope of game data
– Define an appropriate granularity
– Feverously respect invariants
41. Data transformation functions
• Goals:
– Develop a library of transformations
– Define an appropriate granularity
• Base judgment on time and resources required for the transform
– Feverously respect invariants
42. Data transformation functions
• Goals:
– Develop a library of transformations
– Define an appropriate granularity
– Feverously respect invariants
• This will allow powerful architectural opportunities later
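As a hedged illustration of the invariants above (serializable, transferrable, side-effect free, discrete, atomic), here is a minimal Python sketch; the TextureData type, the compress_texture command, and its quality parameter are made up for this example and are not taken from the deck.
```python
from dataclasses import dataclass
import json
import zlib

@dataclass(frozen=True)
class TextureData:
    width: int
    height: int
    pixels: bytes  # raw RGBA bytes, for illustration

    def to_json(self) -> str:
        # Serializable: inputs and outputs can be represented outside the
        # process (file system, network, shared memory).
        return json.dumps({"width": self.width,
                           "height": self.height,
                           "pixels": self.pixels.hex()})

def compress_texture(src: TextureData, quality: int) -> bytes:
    """A discrete, atomic unit of transformation.

    No file or network I/O happens in here (side-effect free), so the call is
    safe on a thread, in a local process, or on a remote machine, and it either
    returns a complete result or raises; there is no half-way state.
    """
    # Placeholder "compression": a real pipeline would invoke a texture codec.
    return zlib.compress(src.pixels, level=max(1, min(9, quality)))
```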
43. Dependency graph
• Purpose:
– Track the global state of data and relationships
– Serves as an interface to both data and transformations
45. Dependency graph
• Purpose:
• Data dependency graph
– The relationship network of data files
– Example:
o Skybox
o Cubemap
o 1D Texture (six)
o Cubemap Texture
• Transformation dependency graph
46. Dependency graph
• Purpose:
• Data dependency graph
– The relationship network of data files
– Example:
– Properties:
• File identity (as a path or GUID)
• Connectivity to and from other files
• Reachability to and from other files
• Transformation dependency graph
47. Dependency graph
• Purpose:
• Data dependency graph
• Transformation dependency graph
– The relationship network of data transformation functions
– Example:
48. Dependency graph
• Purpose:
• Data dependency graph
• Transformation dependency graph
– The relationship network of data transformation functions
– Example:
– Properties and differences from data dependency graph:
• Requires critical path evaluation for a desired result
• Requires deciding if and when a transformation occurs
• Performance critical
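A rough sketch of the two graph flavors just described, assuming illustrative file and command names: the data dependency graph answers identity, connectivity, and reachability queries, while the transformation dependency graph is evaluated in dependency order toward a requested result.
```python
from graphlib import TopologicalSorter  # Python 3.9+

# Data dependency graph: file -> files that are built from it.
data_deps = {
    "face_0.png":          ["cubemap_texture.bin"],  # one of six source faces
    "cubemap_texture.bin": ["cubemap.asset"],
    "cubemap.asset":       ["skybox.asset"],
}

def reachable_from(node, deps):
    """Reachability query: every downstream file affected by editing `node`."""
    seen, stack = set(), [node]
    while stack:
        for succ in deps.get(stack.pop(), []):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen

# Transformation dependency graph: command -> commands it depends on. Only the
# critical path for the requested result needs to be evaluated, in dependency
# order, which is also where if/when-to-run decisions are made.
transform_deps = {
    "pack_cubemap": ["compress_face"],
    "bake_skybox":  ["pack_cubemap"],
}

print(reachable_from("face_0.png", data_deps))
print(list(TopologicalSorter(transform_deps).static_order()))
# -> ['compress_face', 'pack_cubemap', 'bake_skybox']
```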
50. Storage
• Purpose:
• Considerations:
– Often a tradeoff between latency, capacity, and longevity
• Example: low latency, in-memory key-value database
• Example: high latency, networked file system
51. Storage
• Purpose:
• Considerations:
– Often a tradeoff between latency, capacity, and longevity
– Needed for many areas in a pipeline:
• Source data
• Intermediate data
• Inter/Intraprocess communication
• Production data
52. Storage
• Purpose:
• Considerations:
– Often a tradeoff between latency, capacity, and longevity
– Needed for many areas in a pipeline:
– Should never be directly interacted with by transformation functions:
• Fundamentally breaks invariants
• Should be orchestrated at a higher level
– During evaluation of the transformation dependency graph
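A small sketch of the point that transformation functions never touch storage directly: the store classes and the run_transform orchestrator below are hypothetical names, but they show the division of labor in which reads and writes happen only at the orchestration level.
```python
from abc import ABC, abstractmethod
from pathlib import Path

class Store(ABC):
    """One interface over different latency/capacity/longevity trade-offs."""
    @abstractmethod
    def get(self, key: str) -> bytes: ...
    @abstractmethod
    def put(self, key: str, value: bytes) -> None: ...

class MemoryStore(Store):          # low latency, volatile
    def __init__(self):
        self.data = {}
    def get(self, key):
        return self.data[key]
    def put(self, key, value):
        self.data[key] = value

class FileStore(Store):            # higher latency, durable
    def __init__(self, root: Path):
        self.root = root
    def get(self, key):
        return (self.root / key).read_bytes()
    def put(self, key, value):
        (self.root / key).write_bytes(value)

def run_transform(transform, in_key, out_key, src: Store, dst: Store):
    """Orchestration owns all reads and writes; the transformation function
    itself never sees a path, a socket, or a database handle."""
    result = transform(src.get(in_key))
    dst.put(out_key, result)
```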
54. Computational work
• Iteration times:
– For content authors, improving iteration is purely a metric of time
• The amount of computation required is not relevant
55. Computational work
• Iteration times:
– For content authors, improving iteration is purely a metric of time
• The amount of computation required is not relevant
– Often, more computation results in more time required
57. Computational work
• Iteration times:
• Mitigation strategies:
– Data model
• Reduction of transformation steps required to see result
– Transformations are unnecessary or optional
– Direct optimization or removal of transformations
– Concurrency
– Avoidance
58. Computational work
• Iteration times:
• Mitigation strategies:
– Data model
• Reduction of transformation steps required to see result
• Ideal to address, the best general optimization is to do less work
– Concurrency
– Avoidance
59. Computational work
• Iteration times:
• Mitigation strategies:
– Data model
– Concurrency
• Leveraging the invariants of our data transformation functions
• Available contexts:
– Threads
– Processes
– Network
– Avoidance
60. Computational work
• Iteration times:
• Mitigation strategies:
– Data model
– Concurrency
– Avoidance
• Hierarchical caching contexts (more detail later):
– In-memory
– File system
– Network
61. Computational work
• Iteration times:
• Mitigation strategies:
– Data model
– Concurrency
– Avoidance
• Hierarchical caching contexts:
• Pruning action based on criteria:
– Unneeded for desired result
– Redundant
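The concurrency and avoidance strategies above might combine roughly as in this sketch: cached or unneeded work is pruned first, and the remaining side-effect-free transforms are fanned out to worker processes. The compress transform and the in-memory cache dict are stand-ins for illustration.
```python
from concurrent.futures import ProcessPoolExecutor
import hashlib
import zlib

cache = {}  # stand-in for a real cache tier (in-memory, file system, network)

def cache_key(func_name: str, payload: bytes) -> str:
    return func_name + ":" + hashlib.sha256(payload).hexdigest()

def compress(payload: bytes) -> bytes:
    # Any side-effect-free transformation works here.
    return zlib.compress(payload)

def build(requested: list) -> list:
    # Avoidance: prune work whose result is already cached.
    pending = [p for p in requested if cache_key("compress", p) not in cache]
    # Concurrency: the invariants make fan-out to processes safe.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(compress, pending))
    for payload, result in zip(pending, results):
        cache[cache_key("compress", payload)] = result
    return [cache[cache_key("compress", p)] for p in requested]
```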
63. Conceptual architectures
• Live iteration
– Working directly in the game to see changes
– Streaming data to the game through an external tool
• Asset baking
64. Conceptual architectures
• Live iteration
– Working directly in the game to see changes
– Streaming data to the game through an external tool
– Properties:
• Has the benefit of user seeing instant results
• Has the problem of data transience
– Edit state could become corrupt
– Edit state could encounter an irrecoverable error (crashes)
• Has the problem of needing to reflect changes into source data
– Other assets may have dependence on changes
• Asset baking
66. Conceptual architectures
• Live iteration
• Asset baking
– Using an offline toolset to “cook” or “bake” assets for the game
– Properties:
• Has the benefit of permanence
• Has the benefit of simple directional flow of changes
– Source → intermediate → production
• Has the problem, often, of increased computational cost
• Has the problem, often, of non-uniformity
– Example: Different teams having different file formats
– Resulting in a more complex unified pipeline
67. Conceptual architectures
• Live iteration
• Asset baking
• Philosophy:
– Live iteration workflows and asset baking can be reconciled
– Both concepts are shades of the same architectural principles
68. Conceptual architectures
• Live iteration
• Asset baking
• Philosophy:
• Desired pipeline architecture:
– Transferrable library of available data transformations
– Full data dependency graph of all game content
– Computable transformation dependency graphs
– Generalized hierarchical caching model
72. Data transformation functions
• Lesson: Define the data formats
• Lesson: Discipline in respecting invariants
• Lesson: Transferability is crucial
73. Data transformation functions
• Lesson: Define the data formats
– Developed an in-house object serialization library
• Similar to Google Flat Buffers and Protocol Buffers
• Lesson: Discipline in respecting invariants
• Lesson: Transferability is crucial
74. Data transformation functions
• Lesson: Define the data formats
– Developed an in-house object serialization library
• Similar to Google Flat Buffers and Protocol Buffers
• Designed for:
– Memory-mappability
– Binary format for speed, Json format for debugging
– Data layout optimization
– Versioning and Identification
– Automatic API generation
– Custom allocators
• Lesson: Discipline in respecting invariants
• Lesson: Transferability is crucial
75. Data transformation functions
• Lesson: Define the data formats
• Lesson: Discipline in respecting invariants
– Allows for concurrency guarantees
– Allows for data, build, and game state guarantees
• Lesson: Transferability is crucial
76. Data transformation functions
• Lesson: Define the data formats
• Lesson: Discipline in respecting invariants
• Lesson: Transferability is crucial
– Same library of functions can be used at bake time and run time
– Opportunity for the transformation dependency graph to
computationally decide where to execute transform:
• A thread
• A local process
• A networked process
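As a hedged sketch of the kind of format discipline such an in-house serialization library enforces (versioned, binary for speed with a JSON mirror for debugging, memory-mappable fixed layout), here is a toy mesh header; the layout and field names are invented for illustration and are not the studio's actual format.
```python
import json
import struct

# Little-endian layout: 4-byte magic, u16 version, u16 flags, u32 vertex count,
# u32 index count. Fixed layouts like this can be memory-mapped directly.
MESH_HEADER_V1 = struct.Struct("<4sHHII")

def write_mesh_header(version: int, flags: int, vertices: int, indices: int) -> bytes:
    return MESH_HEADER_V1.pack(b"MESH", version, flags, vertices, indices)

def read_mesh_header(blob: bytes) -> dict:
    magic, version, flags, vertices, indices = MESH_HEADER_V1.unpack_from(blob)
    assert magic == b"MESH" and version == 1  # identification + versioning
    return {"flags": flags, "vertex_count": vertices, "index_count": indices}

def debug_dump(blob: bytes) -> str:
    # JSON mirror of the binary layout, handy for debugging.
    return json.dumps(read_mesh_header(blob), indent=2)
```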
77. Dependency Graphs
• Lesson: Most research required here
• Lesson: Describing the flow of a network
• Lesson: Graph operators
78. Dependency Graphs
• Lesson: Most research required here
– Open problem: Scaling computation to millions of nodes
– Open problem: Cache coherent spatial data structures and processing
– Open Problem: Extremely performant transformation dependency
graph evaluation and iterative construction
• Lesson: Describing the flow of a network
• Lesson: Graph operators
79. Dependency Graphs
• Lesson: Most research required here
• Lesson: Describing the flow of a network
– Transformation dependency graph
• Required a language (DSL) to describe the transformation network
• Transformations known as commands:
– Examples of granularity:
» compress_texture
» create_animation_clip
• DSL results in vast amount of automatic code generation
• Lesson: Graph operators
80. Dependency Graphs
• Lesson: Most research required here
• Lesson: Describing the flow of a network
• Lesson: Graph operators
– Transformation dependency graph traversal:
• Two operators could encapsulate our most complex operations
• Reductions:
– Compute a subgraph under a node as input to a command
• Maps:
– Direct input node to a command
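The two traversal operators might look roughly like the sketch below; the graph representation, the command signatures, and the level/texture example are assumptions made for illustration.
```python
def reduce_op(graph, node, command):
    """Reduction: collect the subgraph under `node` and feed it to one command,
    e.g. packing every asset referenced below a level."""
    subgraph, stack = [], [node]
    while stack:
        current = stack.pop()
        subgraph.append(current)
        stack.extend(graph.get(current, []))
    return command(subgraph)

def map_op(graph, node, command):
    """Map: apply one command directly to one input node,
    e.g. compress_texture on a single texture."""
    return command(node)

# Usage sketch with an illustrative graph:
level_graph = {"level_1": ["tex_a", "tex_b"], "tex_a": [], "tex_b": []}
packed = reduce_op(level_graph, "level_1", lambda nodes: f"pack({nodes})")
compressed = map_op(level_graph, "tex_a", lambda n: f"compress_texture({n})")
```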
81. Storage
• Lesson: File system by itself is very limiting
• Lesson: Creating a global asset library
82. Storage
• Lesson: File system by itself is very limiting
– Butting heads with Windows file system rules is painful:
• Path lengths (256 characters)
• Path and file name uniqueness
– Often results in insane naming conventions for files
• Lesson: Creating a global asset library
83. Storage
• Lesson: File system by itself is very limiting
• Lesson: Creating a global asset library
– Many reasons for desirability:
• Freedom for custom identification and file versioning
• Freedom from file system path and naming conventions
• Filenames on physical disk can be mangled, while the asset library
can ensure human readability
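A minimal sketch of a global asset library along these lines, assuming GUID-plus-version identity and SHA-1 mangled on-disk names; the class and its index layout are illustrative, not the actual implementation.
```python
import hashlib
import uuid
from pathlib import Path

class AssetLibrary:
    def __init__(self, root: Path):
        self.root = root
        self.index = {}  # (guid, version) -> human-readable display name

    def store(self, display_name: str, version: int, payload: bytes) -> str:
        guid = str(uuid.uuid4())
        # Mangled on-disk name: short, unique, immune to path-length and
        # naming-convention headaches; readability lives in the index instead.
        mangled = hashlib.sha1(f"{guid}:{version}".encode()).hexdigest()
        (self.root / mangled).write_bytes(payload)
        self.index[(guid, version)] = display_name
        return guid

    def open(self, guid: str, version: int) -> bytes:
        mangled = hashlib.sha1(f"{guid}:{version}".encode()).hexdigest()
        return (self.root / mangled).read_bytes()
```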
84. Caching
• Lesson: Hierarchical caching is a powerful model
– Goal: Never re-compute a function with the same parameters.
– Achieved by decorating functions with appropriate caching strategies:
• Match a function with the most appropriate caching semantic
• Driven by the nature of input and output data
• Example:
– File-based parameters cached to the file system or network
– Normal function parameters cached in memory
85. Caching
• Lesson: Hierarchical caching is a powerful model
– Any piece of computation can be decorated with caching
• Found it to be most beneficial with:
– Path and string manipulation
– Data transformation functions
86. Caching
• Lesson: Hierarchical caching is a powerful model
– Any piece of computation can be decorated with caching
– Hiding caching structure behind generalized interfaces and decorators
allowed for experimentation.
• Proved to be invaluable when we switched our file-system based
key-value store from SQLite to LMDB
87. Caching
• Lesson: Hierarchical caching is a powerful model
– Any piece of computation can be decorated with caching
– Hiding caching structure behind generalized interfaces and decorators
allowed for experimentation.
– Our current implementations:
• In-memory: achieved with function level memoization
• File-system: achieved with LMDB
• Networked: achieved with an in-house technology backed by CEPH
(a networked file system object store)
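A hedged sketch of decorating a function with stacked cache tiers: functools.lru_cache stands in for the in-memory memoization tier, and a plain directory-backed decorator stands in for the LMDB and CEPH layers described above; the function and the cache path are illustrative.
```python
import functools
import hashlib
import pickle
from pathlib import Path

def fs_cached(cache_dir: Path):
    """File-system cache tier; a stand-in for an LMDB-backed layer."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = hashlib.sha256(pickle.dumps((func.__name__, args))).hexdigest()
            slot = cache_dir / key
            if slot.exists():                      # avoidance: reuse prior work
                return pickle.loads(slot.read_bytes())
            result = func(*args)
            slot.write_bytes(pickle.dumps(result))
            return result
        return wrapper
    return decorate

@functools.lru_cache(maxsize=None)                 # in-memory memoization tier
@fs_cached(Path("/tmp/pipeline_cache"))            # file-system tier
def expensive_transform(blob: bytes) -> bytes:
    # Never recomputed for the same input once either tier has seen it.
    return blob[::-1]
```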
90. Hard Problems
• Building the foundation
• Dependency graph scalability
• Dependency graph decision making
91. Hard Problems
• Building the foundation
– Most studios, including ours, have a long history of tools development
– Difficult to modernize and justify changing monolithic, stable tools
– Large upfront cost for changing data formats
• Dependency graph scalability
• Dependency graph decision making
92. Hard Problems
• Building the foundation
• Dependency graph scalability
• Dependency graph decision making
– Performant and meaningful computation against large graphs is
difficult but not insurmountable.
– Pushing automated decision making to cover:
• Caching strategies
• Appropriate contexts for data transformations