Michal Malohlava discusses the magic behind the math - exposing the way that open source big data analysis H2O uses Scala to get work done, and demos how users can interact with Scala to get the most out of data analysis.
Presentation given at the International Conference on
Application and Theory of Petri Nets and Concurrency 2014, in Tunis, Tunisia. You can find the paper manuscript at http://edmundo.lopezbobeda.net/publications .
Discuss seven functions, Analysis of algorithms- Experimental Studies/Primitive operations/Asymptotic notation- Big Oh/Big-Omega/Big-Theta
(Download is recommended to make the animations work)
In den letzten Jahren hat sich die Software RRDtool zu einem unverzichtbaren Werkzeug in den Bereichen Monitoring und System Management entwickelt.
In dieser Präsentation erhalten Sie eine kurze Einführung in die Funktionsweise von RRDtool. Vor allem wird auf die neuen Funktionen in RRDtool 1.5 eingegangen und ein Ausblick auf die weitere Entwicklung der Software gegeben.
Connected hubs: an analysis of the Lufthansa network in EuropeSau Yee Chan
Two beginners in Python analysed network data from the Lufthansa Open API and constructed web applications showcasing these results using Python, SQL and D3.js.
Introduction to Functional Programming with Haskell and JavaScriptWill Kurt
Presentation give to NNSDG on 8/26/2010
Functional programming is often seen as either difficult and academic. Languages such as Haskell, while incredible powerful, don't do much to dismiss this claim. However anyone that's used JavaScript has written a lambda function, and a pretty impressive amount of purely functional code can be implemented in JavaScript. Using these to languages this presentation introduces the fundamentals of functional programming.
Presentation given at the International Conference on
Application and Theory of Petri Nets and Concurrency 2014, in Tunis, Tunisia. You can find the paper manuscript at http://edmundo.lopezbobeda.net/publications .
Discuss seven functions, Analysis of algorithms- Experimental Studies/Primitive operations/Asymptotic notation- Big Oh/Big-Omega/Big-Theta
(Download is recommended to make the animations work)
In den letzten Jahren hat sich die Software RRDtool zu einem unverzichtbaren Werkzeug in den Bereichen Monitoring und System Management entwickelt.
In dieser Präsentation erhalten Sie eine kurze Einführung in die Funktionsweise von RRDtool. Vor allem wird auf die neuen Funktionen in RRDtool 1.5 eingegangen und ein Ausblick auf die weitere Entwicklung der Software gegeben.
Connected hubs: an analysis of the Lufthansa network in EuropeSau Yee Chan
Two beginners in Python analysed network data from the Lufthansa Open API and constructed web applications showcasing these results using Python, SQL and D3.js.
Introduction to Functional Programming with Haskell and JavaScriptWill Kurt
Presentation give to NNSDG on 8/26/2010
Functional programming is often seen as either difficult and academic. Languages such as Haskell, while incredible powerful, don't do much to dismiss this claim. However anyone that's used JavaScript has written a lambda function, and a pretty impressive amount of purely functional code can be implemented in JavaScript. Using these to languages this presentation introduces the fundamentals of functional programming.
talk at Virginia Bioinformatics Institute, December 5, 2013ericupnorth
Extensible domain-specific programming for the sciences
The notion of scientists as programmers begs the question of what sort of programming language would be a good fit. The common answer seems to be both none of them and all of them. Many scientific applications are a combination of general-purpose and domain-specific languages: R for statistical elements, MATLAB for matrix-based computations, Perl-based regular expressions for string matching, C or FORTRAN for high performance parallel computations, and scripting languages such as Python to glue them all together. This clumsy situation demonstrates the need for different domain-specific language features.
Our hypothesis is that programming could be made easier, less error-prone and result in higher-quality code if languages could be easily extended, by the programmer, with the domain-specific features that a programmer or scientists needs for their particular task at hand. This talk demonstrates the meta-language processing tools that support this composition of programmer-selected language features, with several extensions chosen from the previously mentioned list of features.
Being functional in PHP (PHPDay Italy 2016)David de Boer
Functional programming, though far from new, has gained much traction recently. Functional programming characteristics have started to appear in the PHP world, too. Microframeworks such as Silex and Slim, middleware architectures such as Stack and even standards such as PSR-7 rely on concepts such as lambdas, referential transparency and immutability, all of which come from functional programming. I’ll give you a crash course in Erlang, a pragmatic functional language to make you feel familiar with the functional paradigm. By comparing code samples between Erlang and PHP, you’ll find out how you can employ functional programming in your PHP applications where appropriate. You’ll see that functional programming is nothing to be scared of. On the contrary, understanding its concepts broadens your programming horizon and provides you with valuable solutions to your problems.
A Multidimensional Distributed Array Abstraction for PGAS (HPCC'16)Menlo Systems GmbH
DASH is a realization of the PGAS (partitioned global address space) model in the form of a C++ template library without the need for a custom PGAS (pre-)compiler.
We present the DASH NArray concept, a multidimensional array abstraction designed as an underlying container for stenciland dense numerical applications.
After introducing fundamental programming concepts used in DASH, we explain how these have been extended by multidimensional capabilities in the NArray abstraction.
Focusing on matrix-matrix multiplication in a case study, we then discuss an implementation of the SUMMA algorithm for dense matrix multiplication to demonstrate how the DASH NArray facilitates portable efficiency and simplifies the design of efficient algorithms due to its explicit support for locality-based operations.
Finally, we evaluate the performance of the SUMMA algorithm based on the NArray abstraction against established implementations of DGEMM and PDGEMM.
In combination with mechanisms for automatic optimization of logical process topology and domain decomposition, our implementation yields highly competitive results without manual tuning, significantly outperforming Intel MKL and PLASMA in node-level use cases as well as ScaLAPACK in highly distributed scenarios.
Gary Bernhardt’s famous WAT talk pokes fun at the weird things in Ruby and JavaScript due to weak typing and operator overloading. But Go can be strange, too. It has its own odd behaviors, some of which we run into every day. Learning about Go’s corner cases teaches us how Go works under the covers.
User Defined Aggregation in Apache Spark: A Love StoryDatabricks
Defining customized scalable aggregation logic is one of Apache Spark’s most powerful features. User Defined Aggregate Functions (UDAF) are a flexible mechanism for extending both Spark data frames and Structured Streaming with new functionality ranging from specialized summary techniques to building blocks for exploratory data analysis.
This presentation is an introduction to Complex Event Processing (CEP) intended for an practicioners of Runtime Verification. It first describes typical CEP problems, popular tools and their query languages. It then presents BeepBeep 3, an event stream processor that attempts to bridge the gap between RV and CEP. Thanks to BeepBeep’s generic architecture and flexible input language, queries and properties from both fields can be efficiently processed.
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Iterative Spark Developmen...Data Con LA
This presentation will explore how Bloomberg uses Spark, with its formidable computational model for distributed, high-performance analytics, to take this process to the next level, and look into one of the innovative practices the team is currently developing to increase efficiency: the introduction of a logical signature for datasets.
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Sri Ambati
Sandeep Singh, Head of Applied AI Computer Vision, Beans.ai
H2O Open Source GenAI World SF 2023
In the modern era of machine learning, leveraging both open-source and closed-source solutions has become paramount for achieving cutting-edge results. This talk delves into the intricacies of seamlessly integrating open-source Large Language Model (LLM) solutions like Vicuna, Falcon, and Llama with industry giants such as ChatGPT and Google's Palm. As the demand for fine-tuned and specialized datasets grows, it is imperative to understand the synergy between these tools. Attendees will gain insights into best practices for building and enriching datasets tailored for fine-tuning tasks, ensuring that their LLM projects are both robust and efficient. Through real-world examples and hands-on demonstrations, this talk will equip attendees with the knowledge to harness the power of both open and closed-source tools in a coherent and effective manner.
Patrick Hall, Professor, AI Risk Management, The George Washington University
H2O Open Source GenAI World SF 2023
Language models are incredible engineering breakthroughs but require auditing and risk management before productization. These systems raise concerns about toxicity, transparency and reproducibility, intellectual property licensing and ownership, disinformation and misinformation, supply chains, and more. How can your organization leverage these new tools without taking on undue or unknown risks? While language models and associated risk management are in their infancy, a small number of best practices in governance and risk are starting to emerge. If you have a language model use case in mind, want to understand your risks, and do something about them, this presentation is for you!
Dr. Alexy Khrabrov, Open Source Science Community Director, IBM
H2O Open Source GenAI World SF 2023
In this talk, Dr. Alexy Khrabrov, recently elected Chair of the new Generative AI Commons at Linux Foundation for AI & Data, outlines the OSS AI landscape, challenges, and opportunities. With new models and frameworks being unveiled weekly, one thing remains constant: community building and validation of all aspects of AI is key to reliable and responsible AI we can use for business and society needs. Industrial AI is one key area where such community validation can prove invaluable.
Michelle Tanco, Head of Product, H2O.ai
H2O Open Source GenAI World SF 2023
Learn how the makers at H2O.ai are building internal tools to solve real use cases using H2O Wave and h2oGPT. We will walk through an end-to-end use case and discuss how to incorporate business rules and generated content to rapidly develop custom AI apps using only Python APIs.
Applied Gen AI for the Finance Vertical Sri Ambati
Megan Kurka, Vice President, Customer Data Scientist, H2O.ai
H2O Open Source GenAI World SF 2023
Discover the transformative power of Applied Gen AI. Learn how the H2O team builds customized applications and workflows that integrate capabilities of Gen AI and AutoML specifically designed to address and enhance financial use cases. Explore real world examples, learn best practices, and witness firsthand how our innovative solutions are reshaping the landscape of finance technology.
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Sri Ambati
Pascal Pfeiffer, Principal Data Scientist, H2O.ai
H2O Open Source GenAI World SF 2023
This talk dives into the expansive ecosystem of Large Language Models (LLMs), offering practitioners an insightful guide to various relevant applications, from natural language understanding to creative content generation. While exploring use cases across different industries, it also honestly addresses the current limitations of LLMs and anticipates future advancements.
Introducción al Aprendizaje Automatico con H2O-3 (1)Sri Ambati
En esta reunión virtual, damos una introducción a la plataforma de aprendizaje automático de código abierto número 1, H2O-3 y te mostramos cómo puedes usarla para desarrollar modelos para resolver diferentes casos de uso.
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...Sri Ambati
Numerai is an open, crowd-sourced hedge fund powered by predictions from data scientists around the world. In return, participants are rewarded with weekly payouts in crypto.
In this talk, Joe will give an overview of the Numerai tournament based on his own experience. He will then explain how he automates the time-consuming tasks such as testing different modelling strategies, scoring new datasets, submitting predictions to Numerai as well as monitoring model performance with H2O Driverless AI and R.
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...Sri Ambati
In this session, you will learn about what you should do after you’ve taken an AI transformation baseline. Over the span of this session, we will discuss the next steps in moving toward AI readiness through alignment of talent and tools to drive successful adoption and continuous use within an organization.
To find additional videos on AI courses, earn badges, join the courses at H2O.ai Learning Center: https://training.h2o.ai/products/ai-foundations-course
To find the Youtube video about this presentation: https://youtu.be/K1Cl3x3rd8g
Speaker:
Chemere Davis (H2O.ai - Senior Data Scientist Training Specialist)
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
2. Why Scala & H2O ?
●
H 2 O ~ fa s t, d is trib u te d , la rg e s c a le c o m p
p la tfo rm p ro v id in g ric h J a v a A P I
–
B u t lo w -le v e l a n d fo r m a n y u s e r s t o o c o m p li c a t e d
public class ShuffleTask extends MRTask2<ShuffleTask> {
@Override public void map(Chunk ic, Chunk oc) {
if (ic._len==0) return;
// Each vector is shuffled in the same way
Random rng = Utils.getRNG(0xe031e74f321f7e29L + (ic.cidx() << 32L));
oc.set0(0,ic.at0(0));
for (int row=1; row<ic._len; row++) {
int j = rng.nextInt(row+1); // inclusive upper bound <0,row>
if (j!=row) oc.set0(row, oc.at0(j));
oc.set0(j, ic.at0(row));
}
}
}
3. What we provides
●
ScAlH2O - Scala library providing a DSL
–
–
Easy data manipulation and distributed computation
–
●
Abstracting of H2O low-level API
BUT still inside JVM
Scala REPL integration into H2O
–
Console for experimenting with ScAlH2O
5. Frame operations
●
Parse data
●
Basic slicing
–
val f = parse("smalldata/cars.csv")
val f1 = f("name") ++ f(*, 5 to 7)
Column/Rows selectors, append
●
Scalar operations
●
Support head/tail/ncols/nrows/...
●
Cooperation with H2O distributed KV store
–
Load/save operations
val f2 = f1("year") + 1900
val g = load("cars.hex")
val g1 = g ++ g("year") > 80
save("cars.hex", g1)
6. Map/filter/collect operations
●
M ap
–
P e r v a lu e /r o w
●
F ilte r
●
C o lle c t
// Returns a boolean vector
val fm = f map ( new FAOp {
def apply(rhs: Array[scala.Double]):Boolean = rhs(2) > 4;
});
// Collect all cars with more than 4 cylinders
val ff = f filter ( new FAOp {
def apply(rhs: Array[scala.Double]):Boolean = rhs(2) > 4;
});
// Compute sum of 2. column
val fc = f collect ( 0.0, new CDOp() {
def apply(acc:scala.Double rhs:Array[scala.Double]) =
acc + rhs(2)
def reduce(l:scala.Double,r:scala.Double) = l+r
} )
8. Internals
●
No magic, BUT there are key-tricks
–
connect H2O classloaders with Scala ecosystem
●
–
Make sure that all distrib. objects are correctly iced
make translation of Scala code into calls of Java API
●
●
Pass operations around t he cloud
●
–
Create H2O MR tasks
Create new frames
preserve primitives types
●
do not introduce overhead of boxing/unboxing
9. Internals – translation to H2O MR tasks
// Collect all cars with more than 4 cylinders
val f5 = f filter ( new FAOp {
def apply(rhs: Array[scala.Double]):Boolean = rhs(2) > 4;
});
T_A2B_Transf has to be water.Freezable
def filter(af: T_A2B_Transf[scala.Double]):T = {
val f = frame()
val mrt = new MRTask2() {
override def map(in:Array[Chunk], out:Array[NewChunk]) = {
val rlen = in(0)._len
val tmprow = new Array[scala.Double](in.length)
for (row:Int <- 0 until rlen ) {
if (af(Utils.readRow(in,row,tmprow))) {
for (i:Int <- 0 until in.length) out(i).addNum(tmprow(i))
}
}
}
}
mrt.doAll(f.numCols(), f)
val result = mrt.outputFrame(f.names(), f.domains())
apply(result) // return the DFrame
}
11. Towards Scalding-like API
●
V is io n is to p ro v id e S c a ld in g -lik e s y n ta x
Input scheme
Output scheme
f map ( ('name, 'cylinders) -> ('name, 'moreThan4) )
{ (n:String, c:Int) => (n, if (c>4) 1 else 0) }
●
B u t s o fa r D S L is s till u g ly
f map (f, ('name, 'cylinders) -> ('name, 'moreThan4) )
{ new IcedFunctor2to2[Double,Int,Double,Int] {
def apply(n:Double, c:Int) = (n, if (c>4) 1 else 0) }
}
Transformation
12. Try and contribute !
> git clone git@github.com:0xdata/h2o.git
> git checkout -b h2oscala origin/h2oscala
> cd h2o-scala && ./depl.sh # or sbt compile
=== Welcome to the world of ScAlH2O ===
Type `help` or `example` to begin...
h2o>
Thank you!