This deck follows up on Tiark Rompf's talk from the previous day and shows how the Delite framework is used to build domain-specific languages (DSLs) that are optimized at a high level and compiled to different low-level architectures. It introduces existing Delite DSLs for machine learning, data querying, graph analysis, and collections, then walks through the components needed to construct a new one.
Arvindsujeeth scaladays12
1. Arvind K. Sujeeth, HyoukJoong Lee, Kevin J. Brown,
Hassan Chafi, Michael Wu, Victoria Popic, Kunle Olukotun
Stanford University
Pervasive Parallelism Laboratory (PPL)
Tiark Rompf, Aleksandar Prokopec, Vojin Jovanovic,
Philipp Haller, Martin Odersky
Ecole Polytechnique Federale de Lausanne (EPFL)
Programming Methods Laboratory (LAMP)
4. [Diagram: programming models and the hardware they target: Pthreads, OpenMP (Sun T2); CUDA, OpenCL (Nvidia Fermi); Verilog, VHDL (Altera FPGA); MPI, PGAS (Cray Jaguar)]
5. [Diagram: applications (scientific engineering, virtual worlds, personal robotics, data informatics) placed above the same programming models and hardware]
6. [Diagram: DSLs placed between the applications and the programming models]
Too many different programming models
7. - Tiark Rompf's talk yesterday
   - In case you missed it:
     - Techniques for rewriting high-level programs to high-performance programs
     - Build an intermediate representation (IR) of Scala programs at runtime
     - IR can be optimized and code generated
8. - Introduction to existing Delite DSLs
   - Constructing your own Delite DSL
   - Not covered – under the covers:
     - Implementation details about the Delite framework
     - See http://cgo2012.hyperdsls.org/
9. - Syntax is legal Scala
   - Staged to build an IR (metaprogramming)
   - Optimized at a high level
   - Compiled to different low-level target architectures
   [Diagram: expression tree for A * B + A * C]
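To make the staging idea concrete, here is a minimal sketch in plain Scala. It is not Delite's API: `Exp`, `Sym`, `Times`, `Plus`, and `emit` are hypothetical names chosen for illustration. The point is only that operators can return IR nodes instead of computed values, and that a separate pass can turn the resulting tree into code.

```scala
// Minimal staging sketch. All names are hypothetical, not part of Delite.
object StagingSketch {
  sealed trait Exp {
    // Operators build IR nodes instead of computing a result.
    def *(that: Exp): Exp = Times(this, that)
    def +(that: Exp): Exp = Plus(this, that)
  }
  case class Sym(name: String) extends Exp
  case class Times(lhs: Exp, rhs: Exp) extends Exp
  case class Plus(lhs: Exp, rhs: Exp) extends Exp

  // A trivial "code generator": emit the IR as a source string.
  def emit(e: Exp): String = e match {
    case Sym(n)      => n
    case Times(a, b) => "(" + emit(a) + " * " + emit(b) + ")"
    case Plus(a, b)  => "(" + emit(a) + " + " + emit(b) + ")"
  }

  val a = Sym("A")
  val b = Sym("B")
  val c = Sym("C")

  // Reads like ordinary arithmetic, but the value is an expression tree.
  val ir: Exp = a * b + a * c
}
```

Because the program is now data, an optimizer can inspect and rewrite `StagingSketch.ir` before any target code is emitted.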
11. OptiML: An Implicitly Parallel Domain-Specific Language for Machine Learning, ICML 2011
    - Provides a familiar (MATLAB-like) language and API for writing ML applications
      - Ex. val c = a * b (a, b are Matrix[Double])
    - Implicitly parallel data structures
      - Base types: Vector[T], Matrix[T], Graph[V,E], Stream[T]
      - Subtypes: TrainingSet, IndexVector, Image, …
    - Implicitly parallel control structures
      - sum{…}, (0::end) {…}, gradient { … }, untilconverged { … }
      - Arguments to control structures are anonymous functions with restricted semantics
12. untilconverged(mu, tol){ mu =>
      // calculate distances to current centroids
      // move each cluster centroid to the
      // mean of the points assigned to it
    }
13. untilconverged(mu, tol){ mu =>
      // calculate distances to current centroids
      val c = (0::m){i =>
        val allDistances = mu mapRows { centroid =>
          dist(x(i), centroid)
        }
        allDistances.minIndex
      }
      // move each cluster centroid to the
      // mean of the points assigned to it
    }
14. untilconverged(mu, tol){ mu =>
      // calculate distances to current centroids
      val c = (0::m){i =>
        val allDistances = mu mapRows { centroid =>
          dist(x(i), centroid)
        }
        allDistances.minIndex   // callout: fused
      }
      // move each cluster centroid to the
      // mean of the points assigned to it
      val newMu = (0::k,*){ i =>
        val (weightedpoints, points) = sum(0,m) { j =>
          if (c(j) == i) (x(j),1)
        }
        val d = if (points == 0) 1 else points
        weightedpoints / d
      }
      newMu
    }
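For readers without OptiML at hand, the same k-means loop can be sketched with the standard Scala collections. This is a sequential illustration under stated assumptions, not the OptiML implementation: `Point`, `dist`, `step`, and `untilConverged` are hypothetical names, distances are squared Euclidean, and an empty cluster keeps its previous centroid (the slide instead divides by 1).

```scala
// Sequential k-means sketch on plain Scala collections (hypothetical names).
object KMeansSketch {
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def dist(p: Point, q: Point): Double =
    p.zip(q).map { case (a, b) => (a - b) * (a - b) }.sum

  // One iteration: assign each point to its nearest centroid, then
  // move each centroid to the mean of the points assigned to it.
  def step(x: Vector[Point], mu: Vector[Point]): Vector[Point] = {
    val c = x.map(p => mu.indices.minBy(i => dist(p, mu(i))))
    mu.indices.toVector.map { i =>
      val assigned = x.indices.filter(j => c(j) == i).map(j => x(j))
      if (assigned.isEmpty) mu(i) // assumption: keep the old centroid
      else assigned.transpose.map(_.sum / assigned.size).toVector
    }
  }

  // Repeat until no centroid moves by more than tol (squared distance).
  @annotation.tailrec
  def untilConverged(x: Vector[Point], mu: Vector[Point], tol: Double): Vector[Point] = {
    val next = step(x, mu)
    if (mu.zip(next).forall { case (a, b) => dist(a, b) <= tol }) next
    else untilConverged(x, next, tol)
  }
}
```

Each centroid update here traverses all points again; a DSL that knows the semantics of these operations is free to fuse and parallelize such traversals.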
15. - Data querying of in-memory collections
    - Inspired by LINQ
    - SQL-like declarative language
    - Use high-level semantic knowledge to implement query optimizer
16. // lineItems: Iterable[LineItem]
    // Similar to Q1 of the TPC-H benchmark
    val q = lineItems Where(_.l_shipdate <= Date("19981201")).   // callout: hoisted
      GroupBy(l => (l.l_linestatus)).
      Select(g => new Result {
        val lineStatus = g.key
        val sumQty = g.Sum(_.l_quantity)
        val sumDiscountedPrice = g.Sum(r => r.l_extendedprice*(1.0-r.l_discount))   // callout: fused
        val avgPrice = g.Average(_.l_extendedprice)
        val countOrder = g.Count
      })
      OrderBy(_.returnFlag) ThenBy(_.lineStatus)
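A rough equivalent of this query on the standard Scala collections is sketched below. The `LineItem` and `Result` case classes, the integer `yyyymmdd` date encoding, and the function name `q1` are assumptions made for the example. The cutoff is computed once outside the predicate, which is the effect the "hoisted" callout appears to refer to; the aggregates are deliberately left unfused, so each one walks the group separately.

```scala
// Plain-collections sketch of the Q1-style query (hypothetical types and names).
object QuerySketch {
  case class LineItem(l_linestatus: String, l_shipdate: Int,
                      l_quantity: Double, l_extendedprice: Double, l_discount: Double)
  case class Result(lineStatus: String, sumQty: Double, sumDiscountedPrice: Double,
                    avgPrice: Double, countOrder: Int)

  def q1(lineItems: Seq[LineItem]): Seq[Result] = {
    val cutoff = 19981201 // evaluated once, not once per row
    lineItems
      .filter(_.l_shipdate <= cutoff)
      .groupBy(_.l_linestatus)
      .map { case (status, g) =>
        // Each aggregate below is a separate pass over the group.
        Result(status,
          g.map(_.l_quantity).sum,
          g.map(r => r.l_extendedprice * (1.0 - r.l_discount)).sum,
          g.map(_.l_extendedprice).sum / g.size,
          g.size)
      }
      .toSeq
      .sortBy(_.lineStatus)
  }
}
```

A query optimizer with knowledge of these operators could combine the four aggregates into a single traversal per group.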
17. - A DSL for large-scale graph analysis based on Green-Marl
      Green-Marl: A DSL for Easy and Efficient Graph Analysis (Hong et al.), ASPLOS '12
    - Directed and undirected graphs, nodes, edges
    - Collections for node/edge storage
      - Set, sequence, order
    - Deferred assignment and parallel reductions with bulk synchronous consistency
18. Implicitly parallel iteration
    for(t <- G.Nodes) {
      val rank = ((1.0 - d)/ N) + d * Sum(t.InNbrs){w => PR(w) / w.OutDegree}
      PR <= (t,rank)
      diff += Math.abs(rank - PR(t))
    }
    Deferred assignment and scalar reduction
    Writes become visible after the loop completes
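The deferred-assignment semantics can be imitated in plain Scala by computing every new rank from the old rank map and only swapping the maps afterwards. The sketch below assumes a graph given as adjacency maps; `iterate`, `inNbrs`, and `outDegree` are hypothetical names, not the DSL's API.

```scala
// One PageRank sweep with deferred writes (hypothetical names).
object PageRankSketch {
  // inNbrs(t): nodes that link to t; outDegree(w): number of links leaving w.
  // Returns the new ranks and the summed absolute change.
  def iterate(pr: Map[Int, Double], inNbrs: Map[Int, Seq[Int]],
              outDegree: Map[Int, Int], d: Double): (Map[Int, Double], Double) = {
    val n = pr.size
    // Every rank is computed from the old `pr`; nothing is overwritten mid-loop.
    val next = pr.keys.map { t =>
      val rank = (1.0 - d) / n +
        d * inNbrs.getOrElse(t, Seq.empty).map(w => pr(w) / outDegree(w)).sum
      t -> rank
    }.toMap
    // Scalar reduction over all nodes.
    val diff = next.map { case (t, rank) => math.abs(rank - pr(t)) }.sum
    (next, diff) // the caller swaps in `next`, so all writes become visible at once
  }
}
```

Because no iteration reads a value written by another, the loop body can run in any order or in parallel without changing the result.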
19. - A port of a subset of Scala collections to a staged Delite DSL
    - Demonstrates the benefits of high-level optimization and code generation
    val sourcedests = pagelinks flatMap { l =>
      val sd = l.split(":")
      val source = Long.parseLong(sd(0))
      val dests = sd(1).trim.split(" ")
      dests.map(d => (Integer.parseInt(d), source))   // callout: Tuples encoded as longs in back-end
    }
    val inverted = sourcedests groupBy (x => x._1)
    Reverse web-link benchmark in OptiCollections
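The benchmark body runs almost unchanged on the ordinary Scala collections, which makes it easy to check what it computes. The sketch below assumes each input line has the form `source:dest1 dest2 ...`; the wrapper `invert` and its return type are hypothetical, and no tuple-to-long encoding is attempted here.

```scala
// Reverse web-link on standard collections (hypothetical wrapper).
object ReverseLinksSketch {
  // Maps each destination page to the sources that link to it.
  def invert(pagelinks: Seq[String]): Map[Int, Seq[Long]] = {
    val sourcedests = pagelinks.flatMap { l =>
      val sd = l.split(":")
      val source = sd(0).toLong
      // Emit one (destination, source) pair per outgoing link.
      sd(1).trim.split(" ").toSeq.map(d => (d.toInt, source))
    }
    sourcedests.groupBy(_._1).map { case (dest, pairs) => dest -> pairs.map(_._2) }
  }
}
```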
26. 1. Types
       - abstract, front-end
    2. Operations
       - language operators and methods available on types; represented by IR nodes
    3. Data Structures
       - platform-specific concrete implementation, back-end
    4. Code Generators
       - Scala traits that define how to emit code as strings for various IR nodes and platforms
    5. Analyses and Optimizations (Optional)
       - IR rewriting via pattern matching, traversals/transformations (e.g. fusion)
27. abstract class Vector[T] extends DeliteCollection[T]
    abstract class Matrix[T] extends DeliteCollection[T]
    abstract class Image[T] extends Matrix[T]
    [Callout: placeholders for static type checking and method dispatch; not bound to any implementation]
28. trait VectorOps {
      // add an infix + operator to Rep[Vector[A]]
      def infix_+(lhs: Rep[Vector[A]], rhs: Rep[Vector[A]]) = vector_plus(lhs, rhs)
      // abstract, applications cannot inspect what happens
      // when methods are called
      def vector_length(lhs: Rep[Vector[A]]): Rep[Int]
      def vector_plus(lhs: Rep[Vector[A]], rhs: Rep[Vector[A]]): Rep[Vector[A]]
    }
    [Callout: The same abstract Vector we defined earlier]
29. trait VectorOpsExp extends VectorOps with Expressions {
      // a Delite parallel op IR node
      case class VectorPlus(inA: Exp[Vector[A]], inB: Exp[Vector[A]])
        extends DeliteOpZipWith[Vector[A], Vector[A], Vector[A]] {
        // number of elements in the input collections
        def size = inA.length
        // the output collection
        def alloc = Vector[A](inA.length)
        // the ZipWith function
        def func = (a,b) => a + b
      }
      // construct IR nodes
      def vector_plus(lhs: Exp[Vector[A]], rhs: Exp[Vector[A]]) = VectorPlus(lhs, rhs)
    }
30. // a concrete, back-end Scala data structure
    // will be instantiated by generated code
    class Vector[T](__length: Int) {
      var _length = __length
      var _data: Array[T] = new Array[T](_length)
    }
    // corresponding data structures for other back-ends
    // (CUDA, OpenCL, etc.)
    // ...
31. trait ScalaGenVectorOps extends ScalaGen {
      val IR: VectorOpsExp
      import IR._
      override def emitNode(sym: Sym[Any], rhs: Def[Any])(implicit stream: PrintWriter) =
        // generate code for particular IR nodes
        rhs match {
          case v@VectorNew(length) =>
            emitValDef(sym, "new " + remap("Vector") + "(" + quote(length) + ")")
          case VectorLength(x) =>
            emitValDef(sym, quote(x) + "._length")   // callout: The exact back-end field name we defined earlier
          case _ => super.emitNode(sym, rhs)
        }
    }
32. override def matrix_plus[A:Manifest:Arith](x: Exp[Matrix[A]], y: Exp[Matrix[A]]) =
      (x, y) match {
        // (AB + AD) == A(B + D)
        case (Def(MatrixTimes(a, b)), Def(MatrixTimes(c, d))) if (a == c) =>
          // return optimized version
          matrix_times(a, matrix_plus(b,d))
        // other rewrites
        // case ...
        case _ => super.matrix_plus(x, y)
      }
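The rewrite above depends on Delite's `Def` extractor and IR types. The same idea can be tried on a toy IR, as in the sketch below; `Exp`, `Sym`, `Times`, `Plus`, and the smart constructor `plus` are hypothetical stand-ins, not the framework's classes.

```scala
// Pattern-matching rewrite on a toy IR (hypothetical names, not Delite's).
object RewriteSketch {
  sealed trait Exp
  case class Sym(name: String) extends Exp
  case class Times(lhs: Exp, rhs: Exp) extends Exp
  case class Plus(lhs: Exp, rhs: Exp) extends Exp

  // Smart constructor: apply the rewrite while the IR node is being built.
  def plus(x: Exp, y: Exp): Exp = (x, y) match {
    // (AB + AD) == A(B + D): one multiplication instead of two
    case (Times(a, b), Times(c, d)) if a == c => Times(a, plus(b, d))
    // no rule applies: build the ordinary node
    case _ => Plus(x, y)
  }
}
```

Placing the rule in the constructor means the unoptimized node is never created, so later passes only ever see the cheaper form.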
33. trait OptiML extends OptiMLScalaOpsPkg
      with VectorOps with MatrixOps with ...
    trait OptiMLExp extends OptiMLScalaOpsPkgExp
      with VectorOpsExp with MatrixOpsExp with ...
    trait OptiMLCodeGenScala extends OptiMLScalaCodeGenPkg
      with ScalaGenVectorOps with ScalaGenMatrixOps with ...
    trait OptiMLCodeGenCuda extends OptiMLCudaCodeGenPkg
      with CudaGenVectorOps with CudaGenMatrixOps with ...
34. - Delite DSLs target high-performance architectures from Scala
    - Open source – use them to accelerate your apps or build your own!
      - http://github.com/stanford-ppl/Delite
    - Mailing List:
      - http://groups.google.com/group/delite-devel
    - Thank you