Presentation by Gerben de Boer (van Oord) at the Symposium Earth Observation and Data Science, during Delft Software Days - Edition 2017. Thursday, 2 November 2017, Delft.
DSD-INT 2017 High Performance Parallel Computing with iMODFLOW-MetaSWAP - Ver... (Deltares)
Presentation by Jarno Verkaik (Deltares) at the iMOD International User Day, during Delft Software Days - Edition 2017. Tuesday, 31 October 2017, Delft.
Using Ceph for Large Hadron Collider Data (Rob Gardner)
Talk by Lincoln Bryant (University of Chicago ATLAS team) on using Ceph for ATLAS data analysis @ Ceph Days Chicago http://ceph.com/cephdays/ceph-day-chicago/
Deploying and running applications globally, at scale, is becoming easier and faster with Kubernetes. From a single-container Express app to microservice-powered service meshes, cloud-native applications are the norm. In this session, Christopher Bradford explores the aspects you’ll want to consider when taking your application's database global, including:
Deployment topologies like hybrid and multi-cloud
Infrastructure and configuration planning
Resiliency and planning for failure
Security and privacy
Application connectivity
After this session, you’ll be equipped with the bigger picture of what it takes to move your data operations onto Kubernetes in a globally distributed space, as part of a cloud-native stack.
http://bit.ly/1ALVcwR – MapR Director of Architecture and Enterprise Strategy Jim Scott presented a session titled “Time Series Data in a Time Series World.” His session focused on working with time series data, including single-value, geospatial, and log time series data. Focusing on enterprise applications and the data center, it used OpenTSDB as an example to explain some of the key time series concepts, including when to use different storage models.
Things Expo | San Jose, California - November 2014
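One of the core OpenTSDB concepts a session like this covers is its storage model, where points are grouped under a row key built from the metric name, a coarse time bucket, and the tag set, so a scan for one metric and hour touches only a narrow key range. A minimal sketch of that idea (a toy `TinyTSDB` class; all names are illustrative, and this is not OpenTSDB's actual HBase schema):

```python
from collections import defaultdict

class TinyTSDB:
    """Toy time-series store illustrating an OpenTSDB-style layout:
    points are grouped under a row key of (metric, hour bucket, tags)."""

    BUCKET = 3600  # one-hour buckets, similar in spirit to OpenTSDB's row keys

    def __init__(self):
        self._rows = defaultdict(dict)  # row key -> {offset-in-bucket: value}

    def put(self, metric, ts, value, **tags):
        key = (metric, ts - ts % self.BUCKET, tuple(sorted(tags.items())))
        self._rows[key][ts % self.BUCKET] = value

    def scan(self, metric, start, end):
        """Return all (ts, value) points for a metric in [start, end)."""
        out = []
        for (m, base, _tags), cells in self._rows.items():
            if m != metric:
                continue
            for offset, value in cells.items():
                ts = base + offset
                if start <= ts < end:
                    out.append((ts, value))
        return sorted(out)

db = TinyTSDB()
db.put("sys.cpu.user", 1000, 0.5, host="web01")
db.put("sys.cpu.user", 1042, 0.7, host="web01")
db.put("sys.mem.used", 1000, 512, host="web01")
print(db.scan("sys.cpu.user", 0, 2000))  # [(1000, 0.5), (1042, 0.7)]
```

Grouping many points under one bucketed row key is what keeps per-point overhead low in wide-row stores such as HBase, which is the trade-off the "different storage models" discussion turns on.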
Taking Your Database Beyond the Border of a Single Kubernetes Cluster (Christopher Bradford)
Deploying applications on Kubernetes is getting easier every day. From a minimal deployment to distributed, service-mesh-enabled applications, with planning and a little bit of YAML, resilient cloud-native applications are the norm. In this session, Christopher Bradford and Ty Morton will help answer the following questions: - What about the data behind these apps? - Are you running it in a multi-cluster environment, or sending everything back to a common location? - How do you modernize to a distributed peer-to-peer data architecture? - How do you plan for this change? - Are there pitfalls on the road to enlightened data? Join this session to explore the key concepts needed when investigating multi-cluster deployments for data. This includes: - Cluster planning - Network design - Security - Failure handling
A key feature when monitoring and debugging any Cloud infrastructure is to provide the ability to trace, track, and collate all the individual, discrete steps that compose an event. A typical resource action in OpenStack is often a combination of smaller tasks -- which given the distributed nature of OpenStack -- can fail at unpredictable points in the workflow. By collecting the appropriate events, operators can view all events within Ceilometer, filter on a failed action and trace back the history of related events to spot anomalies or errors. In this talk, we provide an overview of the recent enhancements made in Ceilometer to support the collection of event notifications from OpenStack services. We will describe: how events are processed, transformed and stored in Ceilometer; how you can derive metrics from events; and how it’s possible to track the events of a resource and analyse where errors occur.
Building a Data Plane with K8ssandra, Apache Cassandra on Kubernetes (Christopher Bradford)
K8ssandra has made it effortless to deploy Apache Cassandra on Kubernetes. Long a platform best suited to stateless applications, Kubernetes now has modern tooling and APIs that have facilitated the move of databases onto it. Join Chris Bradford in deploying the K8ssandra stack to Kubernetes. Learn how it packages a production Cassandra deployment with supporting tooling alongside Stargate, a next-generation data gateway. We will explore everything from the management interfaces leveraged by DevOps teams to performant, highly available REST, Graph, and Document APIs for developers.
Pachyderm: Building a Big Data Beast On Kubernetes (KubeAcademy)
Pachyderm is a containerized data analytics solution that's completely deployed using Kubernetes. We take all the amazing tools and potential in the container ecosystem and unlock that power for massive-scale data processing. In this talk we'll show you how to leverage Docker, Kubernetes, and Pachyderm to build incredibly robust and scalable data infrastructure. We'll start by discussing the key components of a modern data-driven company and how your infrastructure choices can have a massive impact on your product and scalability roadmap. We'll then dive into some architecture details to show how Kubernetes, Docker, and Pachyderm all work in tandem to create a cohesive data infrastructure stack. Finally, we will demonstrate some high-level use cases and powerful benefits you get from the architecture we've outlined.
KubeCon schedule link: http://sched.co/4WWA
CERN, the European Organization for Nuclear Research, has been running a large OpenStack cloud for several years, helping thousands of scientists analyze the data from the LHC.
In 2012, early in the design phase of the CERN Cloud, we decided to use Nova Cells to enable the infrastructure to scale to thousands of nodes. Now, with more than 280K cores spread across 70 cells hosted in two data centres, we were faced with the challenge of migrating to Nova Cells V2, required as of the Pike release.
In this presentation, we will describe how Nova Cells allowed CERN to scale to thousands of nodes, its advantages, and how we mitigated the implementation issues of Nova Cells V1. Next, we will cover how we upgraded Nova from Newton with Cells V1 to Pike with Cells V2. We will explain the steps that we followed and the issues that we faced during the upgrade. Finally, we will report our experience with Cells V2 at scale, its caveats, and how we mitigate them.
What can I expect to learn?
This presentation describes how CERN migrated from Cells V1 to Cells V2 when upgrading from the Newton to the Pike release.
You will learn the procedures CERN followed to migrate from Cells V1 to Cells V2 in a large production environment.
The issues found during the upgrade, and how we mitigated them, will be discussed.
We will also present how Cells V2 behaves in a large-scale deployment with several thousand nodes in 70 cells.
The next generation of research infrastructure and large scale scientific instruments will face new magnitudes of data.
This talk presents two flagship programmes: the next generation of the Large Hadron Collider (LHC) at CERN and the Square Kilometre Array (SKA) radio telescope. Each in their way will push infrastructure to the limit.
The LHC has been one of the significant users of OpenStack in scientific computing. The SKA is now working to a final software architecture design and is focusing on OpenStack as an underlying middleware function.
Together, we plan to develop a common platform for scaling science: to accommodate new applications and software services, to deliver high ingest rate real-time and batch processing, to integrate high performance storage and to unlock the potential of software defined networking.
Realtime Indexing for Fast Queries on Massive Semi-Structured Data (ScyllaDB)
Rockset is a realtime indexing database that powers fast SQL over semi-structured data such as JSON, Parquet, or XML without requiring any schematization. All data loaded into Rockset are automatically indexed and a fully featured SQL engine powers fast queries over semi-structured data without requiring any database tuning. Rockset exploits the hardware fluidity available in the cloud and automatically grows and shrinks the cluster footprint based on demand. Available as a serverless cloud service, Rockset is used by developers to build data-driven applications and microservices.
In this talk, we discuss some of the key design aspects of Rockset, such as Smart Schema and Converged Index. We describe Rockset's Aggregator Leaf Tailer (ALT) architecture that provides low-latency queries on large datasets. Then we describe how you can combine lightweight transactions in ScyllaDB with realtime analytics on Rockset to power a user-facing application.
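To make the converged-index idea concrete, here is a minimal sketch in which every field of every document is stored both row-wise (for point lookups) and in an inverted index (for predicate queries), with no schema declared up front. The `ConvergedIndex` class and its methods are illustrative toys, not Rockset's actual API:

```python
from collections import defaultdict

class ConvergedIndex:
    """Toy illustration of a converged index: each document field is
    written to a row store (doc id -> fields) and an inverted index
    ((field, value) -> doc ids), so both access patterns are fast."""

    def __init__(self):
        self.rows = {}                    # row store: doc id -> document
        self.inverted = defaultdict(set)  # inverted index: (field, value) -> doc ids

    def add(self, doc_id, doc):
        self.rows[doc_id] = doc
        for field, value in doc.items():
            self.inverted[(field, value)].add(doc_id)

    def get(self, doc_id):
        # Point lookup served by the row store.
        return self.rows[doc_id]

    def where(self, field, value):
        # Predicate query served by the inverted index.
        return sorted(self.inverted[(field, value)])

ix = ConvergedIndex()
ix.add(1, {"city": "Delft", "kind": "symposium"})
ix.add(2, {"city": "Delft", "kind": "workshop", "year": 2017})  # schemaless: extra field
ix.add(3, {"city": "Chicago", "kind": "workshop"})
print(ix.where("city", "Delft"))  # [1, 2]
print(ix.get(3)["kind"])          # workshop
```

The cost of this design is extra write amplification (each field is indexed several ways); the payoff is that queries need no prior schematization or index tuning, which is the trade-off the talk's "Smart Schema" discussion addresses.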
Containers on Baremetal and Preemptible VMs at CERN and SKA (Belmiro Moreira)
CERN, the European Laboratory for Nuclear Research, and SKA, the Square Kilometre Array, are preparing the next generation of research infrastructure for the new large-scale scientific instruments that will produce new magnitudes of data. At the Sydney OpenStack Summit we presented the collaboration and the platform that we plan to develop for scaling science.
In this talk we will present the work done on Preemptible VMs and Containers on Baremetal.
Preemptible VMs are instances that use idle allocated resources in the infrastructure and can be terminated when this capacity is required. Containers on baremetal eliminate the virtualization overhead, enabling the full container performance required for scientific workloads.
We will present the current state, the development and integration decisions, and how these functionalities can be used in a common OpenStack infrastructure.
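The preemption behaviour described above can be sketched with a toy scheduler: preemptible instances soak up idle capacity and are evicted as soon as a normal request needs it. This is a loose illustration of the concept only, not the OpenStack implementation; the `Cloud` class and all names are hypothetical:

```python
class Cloud:
    """Toy scheduler illustrating preemptible instances: they run on
    otherwise-idle cores and are evicted when a normal (non-preemptible)
    request needs that capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.normal = {}        # instance name -> cores
        self.preemptible = {}   # instance name -> cores

    def used(self):
        return sum(self.normal.values()) + sum(self.preemptible.values())

    def launch(self, name, cores, preemptible=False):
        evicted = []
        # Normal requests may evict preemptible instances until they fit.
        while not preemptible and self.used() + cores > self.capacity and self.preemptible:
            victim, _ = self.preemptible.popitem()
            evicted.append(victim)
        if self.used() + cores > self.capacity:
            raise RuntimeError("capacity exhausted")
        (self.preemptible if preemptible else self.normal)[name] = cores
        return evicted

cloud = Cloud(capacity=8)
cloud.launch("batch-1", 4, preemptible=True)  # fills idle cores
cloud.launch("batch-2", 4, preemptible=True)  # cloud now fully used
print(cloud.launch("analysis", 4))            # evicts one preemptible VM to make room
```

The point of the sketch is the asymmetry: preemptible workloads raise utilisation of allocated-but-idle resources at no cost to normal workloads, which always win the capacity back.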
Flink Forward Berlin 2017: Dr. Radu Tudoran - Huawei Cloud Stream Service in ... (Flink Forward)
Huawei Cloud Stream Service uses Flink internally as its job execution engine; by comparison, Kinesis and Alibaba's Stream Compute service use Storm. By the end of this year we will support running Flink on Kubernetes and Mesos in the cloud, along with CEP on SQL and other features. The presentation will show how to create a serverless cloud service from zero, how to provide streaming features with Flink, and how to operate the service with quantization and visualization (by collecting YARN/Flink/OS metrics in real time). This service was developed from scratch in only around three months.
Ceph Object Storage Reference Architecture Performance and Sizing Guide (Karan Singh)
Together with my colleagues on the Red Hat Storage team, I am very proud to have worked on this reference architecture for Ceph Object Storage.
If you are building Ceph object storage at scale, this document is for you.
OpenStack Swift is a very powerful object store that is used in several of the largest object storage deployments around the globe. It ensures a very high level of data durability and can withstand epic disasters if set up in the right way.
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle... (Rittman Analytics)
Most DBAs are aware something interesting is going on with big data and the Hadoop product ecosystem that underpins it, but aren't so clear about what each component in the stack does, what problem each part solves, and why those problems couldn't be solved using the old approach. We'll look at where it's all going with the advent of Spark and machine learning, what's happening with ETL, metadata, and analytics on this platform, and why IaaS and data-warehousing-as-a-service will have such a big impact, sooner than you think.
This presentation will give you information about:
1. HDFS Overview and Architecture
2. Configuring HDFS
3. Interacting With HDFS
4. HDFS Permissions and Security
5. Additional HDFS Tasks
6. HDFS Installation
7. Hadoop File System Shell
8. File System Java API
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C... (Reynold Xin)
(Berkeley CS186 guest lecture)
Big Data Analytics Systems: What Goes Around Comes Around
Introduction to MapReduce, GFS, HDFS, Spark, and differences between "Big Data" and database systems.
DSD-INT 2023 Hydrology User Days - Intro - Day 3 - Kroon (Deltares)
Presentation by Timo Kroon and Nadine Slootjes (Deltares, Netherlands) at the Hydrology Suite User Days (Day 3) - Groundwater modelling, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Thursday, 30 November 2023, Delft.
Presentation by Sabrina Couvin Rodriguez (Deltares, Netherlands) at the Climate Adaptation Symposium 2023, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Wednesday, 29 November 2023, Delft.
Presentation by Umit Taner (Deltares, Netherlands) at the Climate Adaptation Symposium 2023, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Wednesday, 29 November 2023, Delft.
Presentation by Daan Rooze (Deltares, Netherlands) at the Climate Adaptation Symposium 2023, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Wednesday, 29 November 2023, Delft.
DSD-INT 2023 Approaches for assessing multi-hazard risk - Ward (Deltares)
Presentation by Philip Ward (Deltares and IVM VU Amsterdam) at the Climate Adaptation Symposium 2023, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Wednesday, 29 November 2023, Delft.
Presentation by Andrew Warren (Deltares, Netherlands) at the Climate Adaptation Symposium 2023, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Wednesday, 29 November 2023, Delft.
DSD-INT 2023 Global hydrological modelling to support worldwide water assessm... (Deltares)
Presentation by Marc Bierkens (Utrecht University and Deltares, Netherlands) at the Climate Adaptation Symposium 2023, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Wednesday, 29 November 2023, Delft.
DSD-INT 2023 Modelling implications - IPCC Working Group II - From AR6 to AR7... (Deltares)
Presentation by Bart van den Hurk (WGII Co-Chair, IPCC AR7, Deltares) at the Climate Adaptation Symposium 2023, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Wednesday, 29 November 2023, Delft.
DSD-INT 2023 Knowledge and tools for Climate Adaptation - Jeuken (Deltares)
Presentation by Ad Jeuken (Deltares, Netherlands) at the Climate Adaptation Symposium 2023, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Wednesday, 29 November 2023, Delft.
DSD-INT 2023 Coupling RIBASIM to a MODFLOW groundwater model - Bootsma (Deltares)
Presentation by Huite Bootsma (Deltares, Netherlands) at the Hydrology Suite User Days (Day 3) - Groundwater modelling, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Thursday, 30 November 2023, Delft.
DSD-INT 2023 Create your own MODFLOW 6 sub-variant - Muller (Deltares)
Presentation by Mike Muller (hydrocomputing GmbH & Co. KG, Germany) at the Hydrology Suite User Days (Day 3) - Groundwater modelling, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Thursday, 30 November 2023, Delft.
DSD-INT 2023 Example of unstructured MODFLOW 6 modelling in California - Romero (Deltares)
Presentation by Betsy Romero Verástegui (Deltares, Netherlands) at the Hydrology Suite User Days (Day 3) - Groundwater modelling, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Thursday, 30 November 2023, Delft.
DSD-INT 2023 Challenges and developments in groundwater modeling - Bakker (Deltares)
Presentation by Mark Bakker (Delft University of Technology, Netherlands) at the Hydrology Suite User Days (Day 3) - Groundwater modelling, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Thursday, 30 November 2023, Delft.
DSD-INT 2023 Demo new features iMOD Suite - van Engelen (Deltares)
Presentation by Joeri van Engelen (Deltares, Netherlands) at the Hydrology Suite User Days (Day 3) - Groundwater modelling, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Thursday, 30 November 2023, Delft.
DSD-INT 2023 iMOD and new developments - Davids (Deltares)
Presentation by Tess Davids (Deltares, Netherlands) at the Hydrology Suite User Days (Day 3) - Groundwater modelling, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Thursday, 30 November 2023, Delft.
Presentation by Christian Langevin (U.S. Geological Survey (USGS), USA) at the Hydrology Suite User Days (Day 3) - Groundwater modelling, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Thursday, 30 November 2023, Delft.
DSD-INT 2023 Hydrology User Days - Presentations - Day 2 (Deltares)
Presentation by several speakers at the Hydrology Suite User Days (Day 2) - wflow and HydroMT, during the Delft Software Days - Edition 2023 (DSD-INT 2023). Wednesday, 29 November 2023, Delft.
DSD-INT 2023 Needs related to user interfaces - Snippen (Deltares)
Presentation by Edwin Snippen (Deltares, Netherlands) at the Hydrology Suite User Days (Day 1) - Hydrology Suite introduction and River Basin Management software (RIBASIM), during the Delft Software Days - Edition 2023 (DSD-INT 2023). Tuesday, 28 November 2023, Delft.
DSD-INT 2023 Coupling RIBASIM to a MODFLOW groundwater model - Bootsma (Deltares)
Presentation by Huite Bootsma (Deltares, Netherlands) at the Hydrology Suite User Days (Day 1) - Hydrology Suite introduction and River Basin Management software (RIBASIM), during the Delft Software Days - Edition 2023 (DSD-INT 2023). Tuesday, 28 November 2023, Delft.
DSD-INT 2023 Parameterization of a RIBASIM model and the network lumping appr... (Deltares)
Presentation by Harm Nomden (SWECO, Netherlands) at the Hydrology Suite User Days (Day 1) - Hydrology Suite introduction and River Basin Management software (RIBASIM), during the Delft Software Days - Edition 2023 (DSD-INT 2023). Tuesday, 28 November 2023, Delft.
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have had.
For more Tendenci AMS events, check out www.tendenci.com/events
Globus Compute with IRI Workflows - GlobusWorld 2024 (Globus)
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of this work, the team is investigating ways to speed up the time to solution for many different parts of the DIII-D workflow, including how jobs are run on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks, and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Prosigns: Transforming Business with Tailored Technology Solutions (Prosigns)
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
Accelerate Enterprise Software Engineering with PlatformlessWSO2
Key takeaways:
Challenges of building platforms and the benefits of platformless.
Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
How Choreo enables the platformless experience.
How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
Demo of an end-to-end app built and deployed on Choreo.
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.
How to Position Your Globus Data Portal for Success Ten Good PracticesGlobus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
May Marketo Masterclass, London MUG May 22 2024.pdfAdele Miller
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Mind IT Systems
Healthcare providers often struggle with the complexities of chronic conditions and remote patient monitoring, as each patient requires personalized care and ongoing monitoring. Off-the-shelf solutions may not meet these diverse needs, leading to inefficiencies and gaps in care. It’s here, custom healthcare software offers a tailored solution, ensuring improved care and effectiveness.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
DSD-INT 2017 The use of big data for dredging - De Boer
1. The use of big data for dredging
Gerben de Boer, Van Oord, Engineering, OpenEarth data management
Delft Software Days 2017
2. Big data philosophies: Brute force vs Smart force
Statistics requires 30+ realizations.

Brute force: hire the right cloud provider (burn money on cloud providers)
• Hadoop / HDInsight
• Spark
• Cassandra
• U-SQL
• CosmosDB
• Relations: AI

Smart force: hire the right people (burn money on wages)
• Thematic nerds (any engineering)
• Software developers (py, js, sql)
• DevOps
• Sales, social
• Graphic designer
• Relations: business logic + physics

Data roles: data scientist, data analytics manager, data architect, data engineer, statistician, DBA, business analyst, data analyst.
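The "statistics requires 30+ realizations" rule of thumb can be made concrete with a small simulation (illustrative only, not from the slides): the standard error of a sample mean shrinks roughly as 1/sqrt(n), so by around 30 realizations the estimate has stabilized considerably.

```python
import random
import statistics

random.seed(42)

def sem(n, trials=2000):
    """Empirical standard error of the mean of n draws from a unit normal,
    estimated by repeating the experiment many times."""
    means = [statistics.fmean(random.gauss(0, 1) for _ in range(n))
             for _ in range(trials)]
    return statistics.stdev(means)

# The spread of the sample mean drops roughly as 1/sqrt(n):
for n in (3, 10, 30, 100):
    print(n, round(sem(n), 3))
```

With 3 realizations the mean still scatters widely; with 30 it is already a usable statistic, which is the point of the slide's rule of thumb.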
5. SQL has almost no limits

For most users SQL is not big data: only your wallet is a limiting factor.

• Azure Postgres: out of preview 15 Nov; 1 TB
• Azure SQL Server: 99.99% availability; 35 days point-in-time restore; 4 TB
• Postgres in Azure VM: we tried 0.5 TB, limited by SSD disk IO
6. Overcome SQL limits: hybrid and noSQL

SQL
• Pure SQL: a TB-scale SQL database is no problem
• Postgres is single-threaded: use indexing, views and caching tools; think about the content that needs to be delivered (CDN)
• Postgres has a native jsonb datatype
• MS U-SQL can reach ascii files, and use R and Python code

Hybrid
• Put (jsonb) as files on disk and load the subset you need, or when replication is needed
• csv, json, xml, yml, netCDF + many legacy formats
• Database as API, not archive: only an index to files on disk
• E.g. TIFF PostGIS raster
• Van Oord vessellog = netCDF + PG index ("NASA technology")

noSQL = files
• Pure noSQL: structured folders with structured files: device/yy/mm/dd/signal
• Micro service to handle files on demand
• Regular expressions are your friend
• netCDF/HDF was originally devised to overcome SQL limits
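The hybrid "database as API, not archive" pattern above can be sketched in a few lines. The slides use Postgres and netCDF; here sqlite3 and JSON stand in so the example is self-contained, and all paths and column names are illustrative.

```python
# Sketch of the hybrid pattern: raw data stays as files on disk, the
# database holds only an index to them. sqlite3/JSON stand in for the
# Postgres/netCDF combination mentioned in the slides.
import json
import sqlite3
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# 1. Raw data lives on disk in a structured folder (device/yy/mm/dd/signal).
f = root / "vessel1" / "2017" / "11" / "02" / "speed.json"
f.parent.mkdir(parents=True)
f.write_text(json.dumps({"t": [0, 1, 2], "speed_kn": [10.1, 10.4, 10.2]}))

# 2. The database is only an index to the files, not an archive.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE idx (device TEXT, day TEXT, signal TEXT, path TEXT)")
db.execute("INSERT INTO idx VALUES (?, ?, ?, ?)",
           ("vessel1", "2017-11-02", "speed", str(f)))

# 3. Query the index, then load only the subset of files you need.
(path,) = db.execute(
    "SELECT path FROM idx WHERE device = 'vessel1' AND signal = 'speed'"
).fetchone()
data = json.loads(Path(path).read_text())
print(data["speed_kn"])  # the payload itself never passed through the database
```

The design point is that the heavy payload is replicated and read as plain files, while the (small) index stays cheap to query and to rebuild.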
7. OpenEarthRawData: partial checkout

How to get a local copy of a subset of the data?

• vcs: Git has a binary file extension, but Git cannot make a partial checkout.
• WxS webservices: data to WxS on the server, then WxS back to data by the client: 2 unnecessary processing steps.
8. Babbage: storage, bandwidth, compute

The first computer was designed to print goniometric tables flawlessly. Now we replicate the algorithm, not the table. Babbage's table vs calculator are 2 retrieval methods.

The trade-off is made explicit by cloud pay-as-you-go pricing:
• Storage: disk occupancy + IO operations
• Compute: CPU + memory
• Bandwidth: too slow: replicate the database vs replicate the raw data + ETL

In the cloud: copy a DB dump, or copy the raw data and rerun the ETL.
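The table-vs-calculator trade-off becomes a simple cost comparison under pay-as-you-go pricing. A back-of-envelope sketch, with entirely hypothetical prices:

```python
# Babbage's trade-off as cloud economics: store the derived product
# ("the table") or recompute it on demand ("the calculator").
# Both prices are assumed, for illustration only.
STORAGE_PER_GB_MONTH = 0.02   # $/GB/month, hypothetical
COMPUTE_PER_HOUR = 0.50       # $/CPU-hour, hypothetical

def monthly_cost(gb, rebuild_hours, reads_per_month, store=True):
    """Monthly cost of keeping a derived product vs rerunning ETL per read."""
    if store:
        return gb * STORAGE_PER_GB_MONTH               # pay for disk occupancy
    return reads_per_month * rebuild_hours * COMPUTE_PER_HOUR  # pay for CPU

# A 100 GB product that takes 2 CPU-hours to rebuild, read 10x per month:
print(monthly_cost(100, 2, 10, store=True))   # 2.0 -> storing wins here
print(monthly_cost(100, 2, 10, store=False))  # 10.0
```

Flip the read frequency down to once a month and recomputing wins, which is exactly why the slide says the cloud makes this trade-off explicit.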
13. WxS > CDN

WxS
• Good idea: stream graphics to screens (WMS); limits grid data to what you can actually see
• People actually use quad-trees, not WMTS: tiled
• Use (geo)json for plotting vector data: plot.ly
• geojson only became OGC in 2017, 9 years after conception!
• Bad idea: stream big data (WCS, WFS); keep all processing in the datacenters, return only graphical results
• INSPIRE + OGC: not front-runners

CDN
• CDN = content delivery network
• The backbone behind YouTube and Netflix
• Makes datacenters geospatially redundant
• Rapidly replicates raw data files (tiff)
• Use your own ETL tools locally
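The quad-tree tiling referred to above is, in practice, the standard Web Mercator "slippy map" scheme used by OSM-style tile servers: each zoom level quarters every tile of the level above, so the tile address refines like a quad-tree path. A minimal sketch of the well-known tile-index formula:

```python
import math

def deg2tile(lat, lon, zoom):
    """Map a WGS84 coordinate to its (x, y) tile index at a zoom level,
    using the standard Web Mercator slippy-map scheme."""
    n = 2 ** zoom  # tiles per axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

# Delft at increasing zoom: each level refines the quad-tree address.
for z in (0, 1, 8):
    print(z, deg2tile(52.0, 4.36, z))
```

This is why tiled maps scale: the client only ever requests the handful of tiles that cover the screen, never the full grid.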
16. Variety: parsing is ETL

Overload of historic data formats: parsing.
• Datawell wave buoy: 30 kB of code to parse 93 bytes (sensor supplier, SCADA)
• OGC SOS is not a solution: xml garbage
• Satellite data is still very expensive
• Solutions are available: Google protobuffers
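"Parsing is ETL" because legacy sensor formats pack many values into a few opaque bytes. The record below is made up (it is not the actual Datawell format) but shows the flavour of such parsers: fixed-width binary fields, scale factors, and a checksum.

```python
import struct

# Made-up 8-byte record: id, heave (cm), period (0.1 s), flags, checksum.
RECORD = struct.Struct(">HhhBB")

def parse(buf):
    """Unpack one binary record into engineering units."""
    sensor_id, heave_cm, period_ds, flags, checksum = RECORD.unpack(buf)
    if sum(buf[:-1]) % 256 != checksum:
        raise ValueError("checksum mismatch")
    return {"id": sensor_id, "heave_m": heave_cm / 100.0,
            "period_s": period_ds / 10.0, "flags": flags}

payload = struct.pack(">HhhB", 7, 152, 64, 1)
record = payload + bytes([sum(payload) % 256])
print(parse(record))  # {'id': 7, 'heave_m': 1.52, 'period_s': 6.4, 'flags': 1}
```

Multiply this by dozens of suppliers and decades of format revisions and the "30 kB of code to parse 93 bytes" complaint becomes easy to believe; schema-carrying formats such as protocol buffers avoid exactly this hand-written decoding.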
17. ETL vs ELT

ETL
• ETL processes are run once; the database is considered the archive
• ETL removes some raw data features
• Collect once, maybe re-use many times
• Parsers do not evolve: waterfall
• Good for: known knowns
• Share data and processing (Manhattan optimization)

ELT
• In ELT the generic parsers run on each request
• Parsers can run on-the-fly in a micro-service
• All raw data features can be kept as parsers evolve
• Collect once, allow any future use
• Parsers evolve agile: extra from_* methods
• Good for: unknown unknowns
• ELT: share code via github! parser.to_sql(), parser.from_garbage()
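The from_*/to_* parser pattern above can be sketched as a class that grows an extra from_* constructor for each new legacy format, while the to_*() targets stay stable. Class and method names here are illustrative, not the OpenEarth API.

```python
# Sketch of the ELT parser pattern: raw inputs stay on disk untouched,
# and transformation happens on request via evolving from_* methods.
import csv
import io
import json

class Parser:
    def __init__(self, records=None):
        self.records = records or []

    # Each new source format is just another from_* classmethod (agile):
    @classmethod
    def from_csv(cls, text):
        return cls(list(csv.DictReader(io.StringIO(text))))

    @classmethod
    def from_json(cls, text):
        return cls(json.loads(text))

    def to_sql(self, table):
        """Emit INSERT statements; a real version would hand off to a DB."""
        cols = list(self.records[0])
        return [f"INSERT INTO {table} ({', '.join(cols)}) VALUES "
                f"({', '.join(repr(r[c]) for c in cols)});"
                for r in self.records]

p = Parser.from_csv("t,speed\n0,10.1\n1,10.4\n")
print(p.to_sql("vessellog")[0])
```

Because the raw files are kept, a bug fix or a new from_* method can always be replayed over the full history: "collect once, allow any future use".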
18. Datalake + Codelake

• SQL Server can now run R and Python code
• Windows and Linux can run the same containers

Big unstructured Datalake:
• SQL sources + noSQL sources
• Brute force to run ELT jobs: Hadoop
• Economic trade-off: brains vs clouds

Codelake: parser.from_garbage()
19. Big data reinvented the wheel

L0 raw data → L0_L1 code → L1 products → L1_L2 code → L2 products → …
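This processing-level ladder (familiar from satellite data products) is just a chain where each level's product is the input of the next level's code. A toy sketch, with made-up level functions:

```python
# The L0 -> L1 -> L2 ladder as a minimal pipeline; the actual level
# functions are toy stand-ins for calibration and product derivation.

def l0_l1(raw):
    """L0 -> L1: decode raw instrument counts into calibrated values
    (here: a toy gain of 0.5)."""
    return [c * 0.5 for c in raw]

def l1_l2(calibrated):
    """L1 -> L2: derive a product from L1 values (here: a simple mean)."""
    return sum(calibrated) / len(calibrated)

l0 = [2, 4, 6]      # raw instrument counts
l1 = l0_l1(l0)      # calibrated values
l2 = l1_l2(l1)      # derived product
print(l1, l2)
```

Keeping the L0 data and versioning the L0_L1 and L1_L2 code means every higher-level product can be regenerated when an algorithm improves, which is the "wheel" that big data stacks keep reinventing.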
21. Micro services on top of the Datalake

Run micro services on top of the Datalake, one for each specific question. This software needs to work at any data replication:
• Localhost
• Azure
• Amazon
• On-premise
• On-vessel

We need to make servers redistributable: CONTAINERS.
22. OpenEarth: monthly Docker sprint session @ Microsoft NL, Schiphol
Van Oord, Deltares, TU Delft, KNMI, NLeSC, Sogeti, Microsoft, Maris, …

23. OpenEarth Docker Azure DigiShape

Organization:
• Docker sprint session every month
• https://github.com/openearth-stack
• Van Oord, TU Delft, Deltares, Microsoft, NLeSC, KNMI, Maris
• Gerben.deboer@vanoord.com

Components:
• Pyramid python web framework
• PostgreSQL
• KNMI Adaguc
• Geoserver
• ….
25. Variety: low-code apps

Excel is our only big data nightmare: old, grey clerks and managers use Excel as paper. Manual data can be digitized with rapid apps. Low-code revolution: app-in-a-day.
http://www.janbanning.com/

26. Excel course: who ever read the instructions?
https://danjharrington.wordpress.com/2012/08/01/excel-logos-over-the-years/
Gerben J de Boer, Van Oord, E&E, OpenEarth Data Management