Developed for the Denver Art Museum by Ashley Blewer, this slide-deck covers some of the basics of diagnosing issues with Archivematica. Ashley covers everything from the software components involved with Archivematica to monitoring logs, system monitoring, and upgrading your system. The presentation concludes with some useful links for tech-savvy preservationists and for systems administrators unfamiliar with Archivematica!
Developed for DANS-KNAW, this presentation covers some of the fundamentals of the automation-tools, the helper scripts for automating transfers in Archivematica. Designed to complement the API slide-deck, the two resources can be consumed in either order. Knowing the API will help you understand the automation-tools, but knowing the automation-tools may help you understand what you want to create using the API.
API slide-deck here: https://www.slideshare.net/Archivematica/introduction-to-the-archivematica-api-september-2018-122548752
Developed for the University of Denver, this presentation covers some of the most fundamental, and most important, functions available in the Archivematica API. From discovering transfer locations to initiating and approving a transfer, a large part of what is required to automate your transfer workflows can be discovered herein.
There is now a complementary automation-tools slide-deck. The two resources can be consumed in either order. Knowing the API will help you understand the automation-tools, but knowing the automation-tools may help you understand what you want to create using the API.
Automation-tools slide-deck here: https://www.slideshare.net/Archivematica/automation-tools-making-things-go-march-2019
Presentation given by Tim Walsh at Archivematica Camp Baltimore 2018 about his and the Canadian Centre for Architecture's experience with the Archivematica Automation Tools.
Virtual Flink Forward 2020: Build your next-generation stream platform based ... (Flink Forward)
As organizations get better at capturing streaming data, and data velocity and volume keep increasing, traditional messaging queues and log storage systems suffer from scalability, operational, and maintenance problems. Apache Pulsar is a multi-tenant, high-performance distributed pub-sub messaging system. Pulsar's features include native support for multiple clusters in a single Pulsar instance, seamless geo-replication of messages across clusters, very low publish and end-to-end latency, seamless scalability to over a million topics, and guaranteed message delivery with persistent message storage provided by Apache BookKeeper. In this talk, I will use one of the most popular stream processing engines, Apache Flink, as an example to share our experience building a stream processing and storage stack, including:
* How to ensure end-to-end exactly-once semantics based on Pulsar's durable, replayable storage and Pulsar transactions.
* How to implement Pulsar topics as infinite tables based on Pulsar's schema.
* How to efficiently store stream state in Pulsar based on Pulsar's layered storage API.
* A usage scenario chaining all of these functionalities together in the streaming platform.
Flink Forward Berlin 2017: Patrick Lucas - Flink in Containerland (Flink Forward)
Apache Flink, a powerful distributed stateful stream processing framework, is an especially good fit for deployment on a containerization platform: its storage requirement is primarily external (e.g. HDFS or S3), clusters often share the lifetime of the jobs that run on them, and the flexibility of allocating resources on such a platform allows for scaling jobs up and down as necessary. In this talk I will give a brief introduction to Apache Flink, then describe the journey to making it a first-class citizen of the container world. I will cover my experience preparing to publish the “official repository” of Flink images on Docker Hub, the challenges of fitting a Flink deployment in a Kubernetes-shaped box, and the rough edges of Flink itself that were exposed by this process.
Flink Forward Berlin 2017: Dominik Bruhn - Deploying Flink Jobs as Docker Con... (Flink Forward)
This talk focuses on how to package, distribute, and deploy Flink jobs by leveraging existing Docker technology. Previously, deploying Flink jobs was a manual process that led to errors. In this talk, we present an approach that works well in a CI/CD environment by automating most steps: from the code of a Flink job in a repository to a running job on a YARN cluster.
Flink Connector Development Tips & Tricks (Eron Wright)
A look at some of the challenges and techniques for developing a connector for Apache Flink, covering the different types of connectors, lifecycle, metrics, event-time support, and fault tolerance.
Presentation video: https://www.youtube.com/watch?v=ZkbYO5S4z18
Modern software development is increasingly taking a “microservice” approach that has resulted in an explosion of complexity at the network level. We have more applications running distributed across different datacenters. Distributed tracing, events, and metrics are essential for observing and understanding modern microservice architectures.
This talk is a deep dive on how to monitor your distributed system. You will get tools, methodologies, and experiences that will help you understand what your applications expose and how to get value out of all this information.
Gianluca Arbezzano, SRE at InfluxData, will share how to monitor a distributed system and how to switch from a traditional monitoring approach to observability: focus on a server's role rather than its hostname, because names no longer matter much; servers and containers are fast-moving parts, and in case of trouble it is easier to detach a misbehaving one than to treat each server like a cute pet with a name. He will also cover how to design SLOs for your core services and how to iterate on them, and how to instrument your services with tracing tools like Zipkin or Jaeger to measure latency across your network.
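As a tiny illustration of the SLO design mentioned above, an availability target can be translated into an error budget. The function below is a hypothetical sketch for illustration, not something from the talk itself.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed downtime in a rolling window for a given
    availability SLO (e.g. 0.999 means 99.9% availability)."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo_target)

# A 99.9% SLO leaves roughly 43.2 minutes of downtime per 30 days.
print(round(error_budget_minutes(0.999), 1))
```

Iterating on an SLO then becomes a concrete exercise: if you routinely burn the budget, either invest in reliability or relax the target.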
Flink Forward Berlin 2017: Andreas Kunft - Efficiently executing R Dataframes...Flink Forward
While dataflow engines offer scalability, their programming abstractions are often unfamiliar to data scientists, who are used to Python and R. To provide a more convenient interface, dataflow engines like Spark provide an R-like dataframe abstraction. While operations without user-defined code can be executed efficiently, the execution of UDFs is dominated by serialized data exchange between the dataflow engine and an external R process that evaluates the code. We present a new approach to executing user-defined functions using the Truffle/Graal compiler infrastructure, which enables efficient execution of dynamic languages on the JVM. Based on fastR, the R implementation provided by this infrastructure, we demonstrate the execution of R scripts directly inside the data pipelines of Flink, without data serialization and inter-process communication. Furthermore, we discuss future opportunities and problems, and compare our approach to native Flink, Spark, and SparkR.
Keystone Data Pipeline manages several thousand Flink pipelines, with variable workloads. These pipelines are simple routers which consume from Kafka and write to one of three sinks. In order to alleviate our operational overhead, we’ve implemented autoscaling for our routers. Autoscaling has reduced our resource usage by 25% - 45% (varying by region and time), and has reduced our on call burden. This talk will take an in depth look at the mathematics, algorithms, and infrastructure details for implementing autoscaling of simple pipelines at scale. It will also discuss future work for autoscaling complex pipelines.
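The rate-based scaling idea described above can be sketched in a few lines. This is a hypothetical heuristic for illustration only, not Netflix's actual Keystone algorithm; the function name and numbers are invented.

```python
import math

def target_parallelism(observed_rate: float, per_task_capacity: float,
                       headroom: float = 0.8) -> int:
    """Pick a task count so each task runs at `headroom` fraction of its
    measured per-task capacity, leaving slack for traffic spikes."""
    required = observed_rate / (per_task_capacity * headroom)
    return max(1, math.ceil(required))

# 150k msg/s observed, each task handles ~10k msg/s at full tilt:
print(target_parallelism(150_000, 10_000))  # 19 tasks
```

A production system would add damping (e.g. scale only when the recommendation is stable over a window) to avoid thrashing on bursty workloads.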
GrafanaCon 2015 - http://grafanacon.org/
Tobias will give an overview of Prometheus, an open-source monitoring system with a multi-dimensional label system, an expressive query language, and a dashboard editor called PromDash. Learn about the highlights and differences of PromDash compared to Grafana, and discuss the options for making Grafana the primary dashboard editor of the Prometheus project.
This is a talk on how you can monitor your microservices architecture using Prometheus and Grafana. It includes easy-to-execute steps to get a local monitoring stack running on your machine using Docker.
Python is popular amongst data scientists and engineers for data processing tasks. The big data ecosystem has traditionally been rather JVM-centric, and often Java (or Scala) is the only viable option for implementing data processing pipelines. That sometimes poses an adoption barrier for organizations that have already invested in other language ecosystems. The Apache Beam project provides a unified programming model for data processing, and its ongoing portability effort aims to enable multiple language SDKs (currently Java, Python, and Go) on a common set of runners. The combination of Python streaming on the Apache Flink runner is one example. Let’s take a look at how the Flink runner translates the Beam model into the native DataStream (or DataSet) API, how the runner is changing to support portable pipelines, how Python user code execution is coordinated with gRPC-based services, and how a sample pipeline runs on Flink.
In this video you are going to learn what an operator is in Apache Airflow. There are multiple kinds of operators, such as action operators, sensor operators, and transfer operators, and it is important to know why and when to use one over another.
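The operator taxonomy described above can be summarized in a plain-Python sketch. These classes only mimic the idea for illustration; they are not the real airflow.* API, and the class and method names here are invented.

```python
# Plain-Python sketch of Airflow's three operator kinds (NOT the real airflow API).
class BaseOperator:
    def __init__(self, task_id: str):
        self.task_id = task_id

class ActionOperator(BaseOperator):
    """Does something: run a Bash command, call a Python function, etc."""
    def execute(self) -> str:
        return f"{self.task_id}: performed an action"

class SensorOperator(BaseOperator):
    """Waits, by polling, until a condition holds (a file lands, a row appears)."""
    def poke(self) -> bool:
        return True  # in real life: check the external condition here

class TransferOperator(BaseOperator):
    """Moves data from a source system to a destination system."""
    def execute(self) -> str:
        return f"{self.task_id}: moved data between systems"

print(ActionOperator("run_etl").execute())
print(SensorOperator("wait_for_file").poke())
```

The rule of thumb implied by the video: use an action operator to do work, a sensor to wait for something, and a transfer operator to move data between systems.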
If you want access to the entire course and to support my work, go to
https://www.udemy.com/the-complete-hands-on-course-to-master-apache-airflow/?couponCode=YOUTUBE-AIRFLOW
Thank you very much and have a good learning day :)
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p... (Flink Forward)
Many stream processing applications can benefit from, or need to rely on, predictions made with machine learning (ML) methods. In this presentation, new features of Apache SAMOA are presented with a real data processing scenario. These features make Apache SAMOA fully accessible to Apache Flink users: (1) the data stream processed within Apache Flink is forwarded to the Apache SAMOA stream mining engine to perform predictions with stream-oriented ML models; (2) the ML models evolve after every labelled instance and, at the same time, new predictions are sent back to Apache Flink. In both cases, Apache Kafka is used for data exchange. Hence, Apache SAMOA is used as the stream mining engine, provided with input data from, and sending predictions back to, Apache Flink. During the presentation, real-life aspects are illustrated with code examples, such as input and prediction stream integration and monitoring the latency of data processing and stream mining.
Coprocessors - Uses, Abuses, Solutions - presented at HBaseCon East 2016 (Esther Kundin)
Presentation from HBaseCon East 2016
Coprocessors: Uses, Abuses, Solutions
This talk details common issues associated with coprocessor use and deployment, as well as some of the workarounds that our team at Bloomberg used.
The final section was presented by Clay Baenziger.
Group of Airflow core committers talking about what's coming with Airflow 2.0!
Speakers: Ash Berlin-Taylor, Kaxil Naik, Kamil Breguła, Jarek Potiuk, Daniel Imberman, and Tomasz Urbaszek.
Linux Server Deep Dives (DrupalCon Amsterdam) (Amin Astaneh)
Over the past few years the Linux kernel has gained features that allow us to learn more about what's really happening on our servers and the applications that run on them.
This talk explores how these new features, particularly perf_events and eBPF, enable us to answer questions about what a Drupal site is doing in real time, beyond what the standard logs, server performance tools, and even strace will reveal. Attendees will be given a brief introduction to example uses of these tools to diagnose performance problems.
This talk is intended for attendees who are familiar with Linux and the command line, and who have used host observability tools in the past (top, netstat, etc.).
Linux Monitoring and Performance Tuning (Iman Darabi)
How do you monitor a Linux server? What metrics are important when monitoring a server? How do those metrics relate to monitoring tools? What are basic Linux server optimizations, and how do you apply them?
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...) (Red Hat Developers)
The fifth major release of Hibernate contains many internal changes developed in collaboration between the Hibernate team and the Red Hat middleware performance team. Efficient access to databases is crucial for scalable and responsive applications, and Hibernate 5 received much attention in this area. You'll benefit from many of these improvements by merely upgrading, but it's important to understand some of the new, performance-boosting features because you will need to enable them explicitly. We'll explain the development background of all these powerful new features and the investigation process behind the performance improvements. Our aim is to provide good guidance so you can make the most of them in your own applications. We'll also peek at other performance improvements made in JBoss EAP 7, such as in the caching layer, the connection manager, and the web tier. We want to make sure you can all enjoy better-performing applications, requiring less power and fewer servers, without compromising your developers' productivity.
You’re ready to make your applications more responsive, scalable, fast and secure. Then it’s time to get started with NGINX. In this webinar, you will learn how to install NGINX from a package or from source onto a Linux host. We’ll then look at some common operating system tunings you could make to ensure your NGINX install is ready for prime time.
View full webinar on demand at http://nginx.com/resources/webinars/installing-tuning-nginx/
21 people attended the July 2014 program meeting hosted by BDPA Cincinnati chapter. The topic was 'Open Source Tools and Resources'. The guest speaker was Greg Greenlee (Blacks In Technology).
'Open source' refers to a computer program in which the source code is available to the general public for use or modification from its original design. Open source code is typically created as a collaborative effort in which programmers improve upon the code and share the changes within the community. Open source sprouted in the technological community as a response to proprietary software owned by corporations. Over 85% of enterprises are using open source software. Managers are quickly realizing the benefit that community-based development can have on their businesses. This month, we put on our geek hats and detective gloves to learn how we can monitor our computers’ environments using open source tools. This meetup covered some of the most popular ‘Free and Open Source Software’ (FOSS) tools used to monitor various aspects of your computer environment.
Nagios Conference 2014 - Eric Mislivec - Getting Started With Nagios Core (Nagios)
Eric Mislivec's presentation on getting started with Nagios Core. The presentation was given during the Nagios World Conference North America held Oct 13th - Oct 16th, 2014 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/conference.
Prometheus - Intro, CNCF, TSDB, PromQL, Grafana (Sridhar Kumar N)
https://www.youtube.com/playlist?list=PLAiEy9H6ItrKC5PbH7KiELiSEIKv3tuov
-What is Prometheus?
-Differences between Nagios and Prometheus
-Architecture
-Alertmanager
-Time series DB
-PromQL (Prometheus Query Language)
-Live Demo
-Grafana
An operating system (OS) is a software program that manages the resources of a computer system and provides a platform for running applications. Its primary functions include resource management, process management, memory management, file system management, and user interface. There are many different types of operating systems, such as desktop operating systems like Windows and macOS, server operating systems like Linux and Windows Server, and embedded operating systems like those used in mobile phones and other small devices. The choice of operating system depends on the type of device, the intended use, and other factors.
Nagios Conference 2011 - Daniel Wittenberg - Scaling Nagios At A Giant Insur... (Nagios)
Daniel Wittenberg's presentation on a reference story for a German health insurance company. The presentation was given during the Nagios World Conference North America held Sept 27-29th, 2011 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions (Nagios)
Andy Brist's presentation on High Availability and Failover Solutions for Nagios XI. The presentation was given during the Nagios World Conference North America held Oct 13th - Oct 16th, 2014 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/conference
These slides accompany a 1.5 hour webinar sponsored by the Western New York Library Resources Council, presented by Dan Gillean of Artefactual Systems on February 15th, 2017.
The session was intended to introduce participants to some of the key standards, services, and tools available to support digital preservation planning and activities. Part 1 focused on DP101, and how to begin tackling digital preservation in your institution. Part 2 introduced the Archivematica project's history, philosophy, and aims, while Part 3 was a live demonstration of Archivematica in action.
Thank you to WNYLRC for sponsoring this event!
Slides accompanying a presentation given by Dan Gillean of Artefactual Systems at the PERICLES/DPC joint conference and meeting, "Acting on Change: New Approaches and Future Practices in LTDP," held in London at the Wellcome Collection Conference Center, Nov 30 - Dec 2, 2016.
The talk examines the question of the Capacity Gap - why is it that we have so many tools, services, standards, models, and metrics to support digital preservation, but so many organizations feel they do not have the capacity or capability to begin tackling digital preservation within their institution?
The presentation offers a different take based on Dan's experience working as an analyst and consultant for a software development company engaging with many different types of organizations and individuals in the cultural heritage sector. While acknowledging that the under-resourced nature of cultural heritage work plays a key role, this presentation examines some oft-encountered perceptual or cognitive barriers to getting started with digital preservation. It then provides some suggestions on how to overcome these barriers, acknowledging that anything is better than nothing when it comes to DP, and that sometimes perfect can be the enemy of good.
Slides accompanying a presentation by Dan Gillean, delivered at the Glenstone Digital Preservation Roundtable in Potomac, Maryland, November 4th, 2016.
These slides introduce Archivematica's approach to supporting digital preservation workflows, and our development philosophy behind the application.
Slides accompanying a talk delivered by Dan Gillean at PASIG 2016, held at the Museum of Modern Art in New York, NY October 26-28, 2016.
These slides explore the roles that standards play in digital preservation, and introduce some of the key standards that Archivematica was designed with in mind, and which the system uses to help you capture technical, preservation, and administrative metadata when generating Archival Information Packages (AIPs) and Dissemination Information Packages (DIPs).
For more information about Archivematica, see: https://www.archivematica.org
Presentation to the PREMIS Implementation Fair at iPRES 2016, about how PREMIS in METS metadata is implemented in the Archivematica digital preservation system.
Slides accompanying a brief talk given as part of the Archivematica User Group meeting at #SAA2016, the Society of American Archivists 2016 conference in Atlanta, GA. The user group meeting was held on August 3rd in Room 309/310 of the Hilton Atlanta.
These slides offer Archivematica users a brief update on the features included in the current 1.5 release and what's on the roadmap for future releases, as well as discussion of related events and resources such as the first ArchivematiCamp in August, screencasts, and more.
Slides for a presentation made at the Archives Association of British Columbia's 2016 Annual Conference, April 15, 2016, held in Vancouver, BC, Canada.
The slides aim to provide users with a basic introduction to some of the key considerations when implementing a digital preservation plan, describing the workflow with a series of cooking-related references.
Presentation to Toronto Area Archivists' Group, September 11th 2015. See also slide notes: http://www.slideshare.net/Archivematica/getting-started-with-atom-and-archivematica-for-digital-preservation-and-access-notes
These slides are the basis of an Open Repositories 2015 talk about Archivematica integration.
Abstract: The open repository ecosystem consists of many interlocking systems which satisfy needs at different points in content management workflows, and these differ within and among institutions. Archivematica is a digital preservation system which aims to integrate with existing repository, storage and access systems in order to leverage the resources that institutions have invested towards building their repository over time. The presentation will cover every integration the Archivematica project has completed thus far, including DSpace and DuraCloud, LOCKSS, Islandora/Fedora, Archivists' Toolkit, Access to Memory (AtoM), CONTENTdm, Arkivum, HP TRIM, and OpenStack, as well as ongoing projects with ArchivesSpace, Dataverse, and BitCurator. Each of these projects has had its own set of limitations in scope because of the requirements of the project sponsor and/or the limitations of other systems, so in many ways several of them are not, and may never be, 'complete' integrations. The discussion will explore what that means and strategies for expanding the functional capabilities of integration work over time. It will address scoping integration workflows and building requirements with limitations on functionality and resources. We will examine how systems can be built and enhanced in ways that accommodate diverse workflows and varied interlocking endpoints.
Presentation slides from demonstration of hierarchical (or, arranged) DIPs from Archivematica to AtoM. Functionality to be available in Archivematica version 1.5 and AtoM version 2.2.
Report on two projects which used Archivematica installed in a Cloud hosted environment: Council of Prairie and Pacific University Libraries, and ArchivesDirect
5. Supporting technology
● Python: programming language
● Django: web application framework
● Gearman: job scheduler
● MySQL: relational database
● Elasticsearch: search index
● Nginx: web server (can be apache)
● Gunicorn: interface between Python and Nginx
● git: version control system
● Ansible/Docker: deployment/configuration management
6. All on Linux
● Ubuntu 16.04 or 18.04
● CentOS 7 or Red Hat
7. Format Policy Registry
● Tools we use to perform preservation actions
● Rules we use to determine when to use the Tools
● Commands are applied to files based on the Rules
10. Technical stack
● Lots of tools = lots of potential points of failure
● Archivematica strives to relay as much information as possible to the user -- especially about what the tools are doing and what they are producing
11. Components
● Dashboard: for the user
● MCPClient: does the work
● MCPServer: manages the work
● Storage Service: manages storage
12. Logging in
● Logging in (ssh)
● Moving files (scp)
● What’s running (ps -ef | grep py)
● How much space? (du)
● How much free space? (df -h)
● Load average? (top)
● Read end of logs (tail)
● Read logs (less)
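The commands above can be run together as a quick health check once you have logged in. This is an illustrative sketch, not an official script; the `/var` paths are typical defaults and may differ on your installation.

```shell
# Quick diagnostics on the Archivematica host (paths are examples).
ps -ef | grep '[p]y' || true          # Python processes (bracket trick hides grep itself)
du -h --max-depth=1 /var 2>/dev/null || true   # space used per subdirectory
df -h                                 # free space per filesystem
uptime                                # load averages for the last 1, 5, and 15 minutes
tail -n 50 /var/log/syslog 2>/dev/null || true # read the end of a log
```

Use `less` instead of `tail` when you want to page through a whole log interactively.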
14. Moving files
Download a file from the remote machine to your computer:
scp your_username@remotehost.url:your-file.txt /your/local/directory
Send a file from your computer to the remote machine:
scp path/to/your-file.txt your_username@remotehost.url:/some/remote/directory
15. What’s running?
ps -ef | grep py
These services should all be running:
● Dashboard (apache)
● Database (mysql)
● Elasticsearch (elastic)
● Storage Service (uwsgi or nginx)
● FITS
● Server (MCP): should show both MCP server and MCP client
16. What’s running?
ps -ef | grep py
Also, these dependent services should all be running:
● MySQL
● Elasticsearch
● Gearman
● Nginx
● Nailgun
● Clamav
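The checks on slides 15 and 16 can be sketched as a small loop. The process names below are typical defaults and assumptions on my part (e.g. MySQL usually appears as `mysqld`, and Elasticsearch runs inside a `java` process); adjust them to match your distribution.

```shell
# Sketch: report whether each expected supporting process is present.
check_services() {
  for proc in mysqld java gearmand nginx clamd; do
    if pgrep "$proc" >/dev/null 2>&1; then
      echo "$proc: running"
    else
      echo "$proc: NOT running"
    fi
  done
}
check_services
```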
17. du
See the amount of space used on the machine. To get the size of each subdirectory of the directory you are in, you can run:
du -h --max-depth=1
This command can take a long time if you have very large mounted drives.
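When hunting for what is eating disk space, the same `du` survey is easier to read sorted largest-first:

```shell
# Largest subdirectories of the current directory, biggest first.
du -h --max-depth=1 . 2>/dev/null | sort -rh | head -n 10
```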
18. Check free space on disk
df -h
● Up to 3x the size of a transfer may be needed as free space for processing
● A cron job can auto-clear deleted/rejected files
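A cron job like the following could implement that auto-clearing. The schedule, retention window, and shared-directory path are all assumptions for illustration (`/var/archivematica/sharedDirectory` is a common default); verify the path on your system before enabling anything that deletes files.

```shell
# Hypothetical crontab entry (edit with `crontab -e`):
# at 02:00 daily, delete rejected material older than 7 days.
0 2 * * * find /var/archivematica/sharedDirectory/rejected -mindepth 1 -mtime +7 -delete
```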
20. Restarting services
service archivematica-dashboard restart
service archivematica-mcp-client restart
service archivematica-mcp-server restart
service archivematica-storage-service restart
service gearmand restart
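The five restarts can be wrapped in one loop. This sketch only prints the commands (remove the `echo` to actually run them; that requires root, and service names can vary between installs):

```shell
# Dry run: show the restart command for each core service.
restart_all() {
  for svc in archivematica-dashboard archivematica-mcp-client \
             archivematica-mcp-server archivematica-storage-service gearmand; do
    echo sudo service "$svc" restart   # drop "echo" to execute
  done
}
restart_all
```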
21. Reading logs
less /var/log/archivematica/dashboard/dashboard.log
less /var/log/archivematica/dashboard/dashboard.debug.log
less /var/log/archivematica/MCPClient/MCPClient.log
less /var/log/archivematica/MCPClient/MCPClient.debug.log
less /var/log/archivematica/MCPServer/MCPServer.log
less /var/log/archivematica/MCPServer/MCPServer.debug.log
less /var/log/archivematica/storage-service/storage-service.log
less /var/log/archivematica/storage-service/storage-service.debug.log
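Rather than paging through each log in turn, a quick scan for errors across all of them can narrow down where to look first. The log locations match the list above; adjust the path if your install differs.

```shell
# Count lines mentioning errors or Python tracebacks across all Archivematica logs.
scan_logs() {
  grep -riE 'error|traceback' /var/log/archivematica/ 2>/dev/null | wc -l
}
echo "error lines found: $(scan_logs)"
```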
24. Upgrading
● Each new release requires a decision: whether to upgrade, and how much time to set aside for it.
● The tradeoff of not upgrading is falling out of step with the community and having a harder time getting support for an older version.
● It is a good idea to test the upgrade: make a backup of your production environment and test the upgrade there. If that is not possible, plan for downtime.
○ To make this possible, you may want to explore virtualizing your Archivematica environment so you can run a development (testing) environment alongside production.
25. Security upgrades
● Make sure that Ubuntu is set up to do Unattended Upgrades, which apply security patches (the equivalent of Windows updates).
● Sometimes these upgrades require a system restart; you may need to plan for 30 minutes of downtime. Do not restart in the middle of processing: make sure your current Transfers/AIPs are done first.
27. Getting Help
● Participating in the community forum
○ Archivematica
https://groups.google.com/forum/#!forum/archivematica
● Documentation
○ Main docs https://www.archivematica.org/en/
○ Wiki https://wiki.archivematica.org/Main_Page
● Github issues
○ Main repo https://github.com/archivematica/Issues/issues
28. See also
This presentation in document form
● For tech-savvy preservationists: https://docs.google.com/document/d/1GybyH7X_gpZ7wpYVo5d9__LeGNuXYCky0oairJGJAmo/edit#heading=h.y1nyq0vlcvsl
● For Archivematica-unfamiliar systems administrators: https://docs.google.com/document/d/1NDzGHBGuPFa7GTHCMEl3D2nvvdZRxG2FpdsGAYoG31I/edit#