The document discusses new features in version 0.9.4 of the DivConq file transfer software, including file tasks that can be triggered by uploads, scheduling, or file system events. It introduces dcScript, the scripting language that allows users to string together various file operations and tasks. Key points include that dcScript scripts can run asynchronously, optimize file operations through in-memory streaming rather than disk reads/writes, and offer features to simplify complex multi-step file tasks. The document provides examples of using dcScript to encrypt, compress, split and transfer files with just a few lines of code.
2. Version 0.9.4 of DivConq MFT is a “concept” release to highlight some of the
plans we have. Not all the concepts are fully developed or fully tested; this is Beta
quality, as all 0.9.N releases will be.
With 0.9.4 comes:
• File Task (script) triggered on successful upload
• File Task (script) scheduler
• File Task (script) triggered by file system events (Windows, Linux and OS X)
At the heart is dcScript, the scripting language of DivConq.
3. It is the “glue” that brings various file (and related) operations together to support
a specific business goal.
It is not optimized for scientific number crunching, but it is fast enough for file
transfer tasks.
Scripts are meant to be small, calling supporting Java classes. Java does the
work; scripts declare the intent behind the work.
XML is an ideal grammar for declaring intent, so dcScripts are written in XML.
dcScript is designed so that each instruction may be executed asynchronously.
This is essential to DivConq because our threading model is optimized around all
operations being async. Fortunately dcScript takes the complexity out of async
programming, see next slide...
4. What does it mean that dcScript takes the complexity out of async programming?
Review this example:
<dcScript>
<Main>
   <Console>Before Sleep</Console>
   <Sleep Seconds="3" />
   <Console>After Sleep</Console>
</Main>
</dcScript>
The Sleep instruction, like many other instructions in dcScript, supports async
operations. So when this code runs, the “Before” message appears and three
seconds later the “After” message appears. However, the thread does not block;
rather, it does other work while the script does nothing for three seconds.
There is no need for a Callback, a Future or an Await – dcScript picks up right after
Sleep automatically, as if this were sync code. But it is not.
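For comparison, here is a rough sketch in plain Java of what that hides. Without dcScript, the non-blocking version of the Sleep example needs an explicit continuation (the class and method names here are illustrative, not part of DivConq):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AsyncSleepDemo {
    // Collects output so the ordering can be checked; stands in for <Console>.
    static final List<String> log = new CopyOnWriteArrayList<>();

    // Non-blocking equivalent of <Sleep Seconds="n"/>: instead of blocking the
    // thread, the "After Sleep" continuation is scheduled to run later.
    public static CompletableFuture<Void> run(ScheduledExecutorService timer, long seconds) {
        log.add("Before Sleep");
        CompletableFuture<Void> done = new CompletableFuture<>();
        timer.schedule(() -> {
            log.add("After Sleep");
            done.complete(null);
        }, seconds, TimeUnit.SECONDS);
        return done; // the calling thread is free to do other work meanwhile
    }

    public static void main(String[] args) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        run(timer, 3).join(); // join only so the demo JVM waits before exiting
        log.forEach(System.out::println);
        timer.shutdown();
    }
}
```

dcScript expresses the same flow as three straight-line instructions; the engine supplies the scheduling and the continuation for you.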
5. dcScript benefits DivConq by complementing our threading model. How does it
benefit you?
The number one reason is that it puts complex file operations at your fingertips. For
example, let's say you have a file operation requirement:
• Starting with a folder of files in /Work/Temp/Source/karabiner
• You need to tar those files into a tar ball
• Then gzip the tar ball
• Then PGP encrypt the gzipped tar ball
• Then split that encrypted file into 512MB chunks, named karabiner_N.tgz.gpg
where N is the number sequence, and store those split files in /Work/Temp/Dest/karabiner
6. This is it. Source and Dest folder. Then a list of operations, which automatically
form a chain – Tar, Gzip, PGPEncrypt, Split.
<dcScript>
<Main>
   <LocalFolder Name="fh0" Path="/Work/Temp/Source/karabiner" />
   <LocalFolder Name="fh2" Path="/Work/Temp/Dest/karabiner" />
   <FileOps>
      <Tar Source="$fh0" />
      <Gzip />
      <PGPEncrypt Keyring="/Work/Keys/pubring.gpg" />
      <Split Dest="$fh2" Size="512MB" Template="karabiner_%seq%.tgz.gpg" />
   </FileOps>
</Main>
</dcScript>
Not only is this easy to write and easy to understand, these operations have all
been optimized for you. How? Through in-memory streaming…
7. [Diagram: a Local source is read from disk, flows through Op 1 → Op 2 → Op 3 → Op 4
(operations such as Tar, Gzip, PGP) as in-memory transformations, and is written to a
Local destination.]
Typically these operations would require 4 separate reads from and writes to disk;
disk (even SSD), however, is very slow compared to memory. This is particularly
true when attempting to scale your server. What if you had 4 threads
processing files? Then you'd need to read/write 16 files, and overall performance
goes down. With streaming you reduce disk access considerably, saving
your server's I/O for use with scaling. And maybe reducing IaaS costs too, when
using a service such as Amazon EC2 – I/O operations can be expensive.
8. [Diagram: a Local source is read from disk, passes through a single Copy operation,
and is written to a Local destination.]
How does Copy normally work in Java? Your code reads a single buffer of
data from the disk – probably around 32KB. Then the code writes the buffer to
the new file on disk. This loop is repeated until there is no more data to read.
DivConq streaming expands on this principle.
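The loop described above can be sketched in a few lines of plain Java (a generic copy loop, not DivConq's actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferCopy {
    // The classic copy loop: read one buffer (~32KB), write it out, and
    // repeat until read() signals end of stream by returning -1.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[32 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream("hello streaming".getBytes()), sink);
        System.out.println(copied + " bytes copied");
    }
}
```

Only one buffer is in flight at a time, which is the property DivConq's streaming generalizes across a chain of operations.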
9. [Diagram: the request for content starts at the Destination and bubbles up the
stream through Op 4 → Op 3 → Op 2 → Op 1 toward the Source.]
DivConq streams only start when you supply a Destination. Op 4 is the target of
the Destination, so work begins at Op 4.
10. [Diagram: Op 1 calls Read on the source file and a buffer is returned.]
The file is read just as any normal file read would be in Java.
11. [Diagram: the buffer is transformed and relayed downstream by each Op in turn.]
12. [Diagram: after the final transformation in Op 4, the buffer is written to the
destination file on disk.]
The buffer is written to a file after the final transformation.
13. [Diagram: the request for content again bubbles up the stream from the
Destination.]
Op 4 requests more content. Repeat until Op 1 reaches the end of the source file.
14. Transformations sometimes require more data than a single buffer may hold. In
these cases the Op may issue a request for content upstream before sending
the transformed buffer downstream. Any Op may issue any number of upstream
requests before passing data downstream.
Any Op may also issue any number of async commands before pulling from
upstream or pushing downstream. Yet the stream resumes right where it left off
after the async request returns.
Data sources may be read or written with async operations, without interrupting
the stream's normal operation.
Streams are stack safe – you will not experience a StackOverflowError using DivConq
streams.
Streams are automatically throttled; no single stream will overload your CPU.
Streams are fast – only slightly slower than a normal Java file copy.
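The pull-driven flow of the preceding slides can be mimicked with a toy pipeline in plain Java (illustrative names only – this is not the DivConq API): the destination pulls from the last op, and each pull bubbles upstream to the source.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.UnaryOperator;

// Minimal sketch of a pull-driven stream. Pullable stands in for a stream
// stage; pull() returns null at end of stream.
interface Pullable {
    String pull();
}

public class PullPipeline {
    // The source: produces one "buffer" per pull, like Op 1 reading the file.
    static Pullable source(Iterator<String> it) {
        return () -> it.hasNext() ? it.next() : null;
    }

    // An Op: each pull is forwarded upstream, then the returned buffer is
    // transformed on its way downstream.
    static Pullable op(Pullable upstream, UnaryOperator<String> transform) {
        return () -> {
            String buf = upstream.pull(); // request bubbles up the stream
            return buf == null ? null : transform.apply(buf);
        };
    }

    // The destination: keeps pulling until the source is exhausted.
    public static List<String> drain(Pullable dest) {
        List<String> out = new ArrayList<>();
        for (String buf = dest.pull(); buf != null; buf = dest.pull())
            out.add(buf); // the "write disk" step
        return out;
    }

    public static void main(String[] args) {
        Pullable src = source(List.of("a", "b").iterator());
        Pullable upper = op(src, String::toUpperCase);    // Op 1
        Pullable tagged = op(upper, s -> "[" + s + "]");  // Op 2
        System.out.println(drain(tagged));                // prints [[A], [B]]
    }
}
```

Nothing happens until drain() is called, mirroring how DivConq streams only start once a Destination is supplied.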
15. [Diagram: the same Source → Op 1 … Op 4 → Destination pipeline, kept in memory –
not temp files – when possible.]
In the future DivConq will support many protocols for source and destination:
SFTP, S3, HDFS, WebDAV and SMB.
16. [Diagram: the same pipeline, with protocol endpoints (SFTP, S3, HDFS, WebDAV,
SMB) on both the source and destination sides.]
Mix and match source and destination – for example, pull from SFTP and write to S3.
No need to write to local disk at all.
17. Streaming is a powerful feature. dcScript makes it easy to use this feature as
well as other File Task operations such as file selection, file filtering and file
ordering. And of course it will support the basics such as moving files, deleting
files, renaming files, transforming file formats, manipulating zip files, joining files,
hashing files, and verifying OpenPGP signatures.
dcScript will offer significant support for reading and writing text files (line by line)
right in script. We will also add support for other common formats such as XML,
JSON, Excel and CSV.
We will continue to optimize dcScript’s operations so that you get the best
possible performance without worrying about tuning. dcScript brings worry free:
• Threading
• Streaming
• Optimizations
18. dcScript offers a full set of scripting instructions:
• For, While, Until Loops
• If/Else Conditions
• Select/Case
• Function Calls
• Function Libraries
However, this does not mean that dcScript should be used to write big programs. It
is a glue language designed for simplicity. You will (in the future) be able to create
extensions to dcScript in Java or Groovy – use these extensions to do the heavy work
and use dcScript to glue the pieces together.
Also note that the following introduction is not intended to teach computer
programming. You should already have experience with JavaScript or a similar
language and be familiar with variables, for loops, and if/else conditions.
19. Many features of dcScript are ready for you to try. Be sure to get the DivConq 0.9.4
demo release: https://github.com/Gadreel/divconq/wiki/Getting-Started
Running the dcScript demos is best done with a GUI, so please do this in an OS
with a GUI environment such as Windows, OS X or a GUI Linux desktop – not in
headless Amazon Linux (or similar).
You may try dcScript within either dcFileServer or dcFileTasks. Run your server in
Foreground mode as detailed in the instructions. Select option 100 (“dcScript GUI
Debugger”).
In the GUI window that appears, select File and Open. Navigate to “packages/
dcTest/dcs/examples/01-variables.dcs.xml”.
20. You now have a script that looks like this only with more comments in it.
<dcScript Title="Variable Basics">
<Main>
   <Var Name="bottles" Type="Integer" SetTo="99" />
   <Console>bottles = {$bottles}</Console>
   <With Target="$bottles">
      <Inc />
   </With>
   <Console>$bottles</Console>
   ...
Use the double arrow button (the first button) on the right to step into the script. Step
until you get to the first Console instruction. Notice that the line of the script you
are about to execute is highlighted. Select <Main> in the box labeled Stack; you can
now see the values of all the variables in the Main scope. Step some more and
notice the values of the variables change as you do.
21. There are 10 example scripts available for you to learn the basics of dcScript. Use
the GUI debugger and read the comments to learn our scripting engine.
Follow the instructions for the dcFileTasks demo (Test Tasks) at
https://github.com/Gadreel/divconq/wiki/Getting-Started-Config#config-for-tasks to
see scripts integrated into the server's features. Review also the scripts at
packages/dcTest/dcs/test/06-scheduler.dcs.xml and
packages/dcTest/dcs/test/07-watch-file.dcs.xml.
Try creating your own script that runs on a schedule or when a file is modified.
See notes in the wiki link above about where to add your scripts.
22. <LocalFolder Name="fs1" Path="c:/temp/test" />
<LocalFolder Name="fs3" Path="c:/temp/testtar" />
<FileOps>
   <Tar Source="$fs1" NameHint="test-files-ball" />
   <Gzip Dest="$fs3" />
</FileOps>
Give streams a try with something like the above – take a source folder, then Tar
and Gzip it to a new destination.
Reverse it with Ungzip and Untar operations.
Try an encrypt like this file op:
<PGPEncrypt Source="$fh0" Dest="$fh1" Keyring="/Users/Owner/.gnupg/pubring.gpg" />
And maybe try a Tar and a PGPEncrypt. PGPDecrypt is not supported yet, but
there is still enough to play with. dcScript and DivConq are off to a promising
start.
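For the reverse direction, a minimal sketch – assuming, and this is only an assumption, that Ungzip and Untar take the same Source/Dest attributes as their forward counterparts:

```xml
<!-- Hypothetical sketch: attribute placement assumed symmetric with Tar/Gzip -->
<FileOps>
   <Ungzip Source="$fs3" />
   <Untar Dest="$fs1" />
</FileOps>
```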