© 2019 Composewell Technologies
Streamly:
Concurrent Data Flow
Programming
Harendra Kumar
15 Nov 2019
Composewell Technologies
About Streamly
Ergonomics of Shell
Safety of Haskell
Speed of C
Magical Concurrency
https://github.com/composewell/streamly
About Me
• C programming
• OS kernel, file systems
• Haskell since 2015
harendra@composewell.com
@hk_hooda
About Composewell
• Develops Streamly
• Provides commercial support
• Designs solutions using Streamly
• Provides training on Streamly
http://www.composewell.com
hello@composewell.com
“Simplicity is a great virtue but it
requires hard work to achieve it
and education to appreciate it. And
to make matters worse: complexity
sells better.”
— Edsger Wybe Dijkstra
Streams
What is a stream?
• A sequence of items of the same type
• Combinators to process the sequence
Pure        Effectful
List        Stream (Effectful List)
Imperative  Functional
Loop        Stream (Modular Loops)
Do I need streams?
• Imperative version: Do I need loops?
Streamly = Efficient, Composable and Concurrent Loops
Who should use Streamly?
• General purpose framework
• Declarative concurrency
• High performance, nearing or beating C
• Some examples:
• Server backends
• Reactive programming
• Real time data analysis
• Streaming data applications
Streamly At a Glance
Engineering Focused Library (Goals)
Ergonomics     Python and Shell
Performance    C
Composability  Haskell
Safety         Haskell
Simplicity     Use only basic Haskell
Composability + Performance
Streamly Version
• Examples in this presentation use streamly-0.7.0
• Some examples may use APIs from Streamly.Internal.* modules
• Source code for examples can be found at:
https://github.com/composewell/streamly-examples
Fundamental Operations (Conceptual)
Operation   Shape
Generate    a -> Stream m b
Transform   Stream m a -> Stream m b
Eliminate   Stream m a -> m b
Fundamental Data Types (Actual)
Type           Constructor                                        Fields (s is the state)
Unfold m a b   forall s. Unfold (s -> m (Step s b)) (a -> m s)    generate, inject
Stream m a     forall s. Stream (s -> m (Step s a)) s             generate, state
Fold m a b     forall s. Fold (s -> a -> m s) (m s) (s -> m b)    accumulate, initial, extract
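A minimal sketch of how these constructors can drive a loop, using only base. It is a simplified model for illustration, not Streamly's actual definitions: the two-constructor `Step` type and the names `fromListU`, `sumF` and `unfoldFold` are assumptions made here.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Functor.Identity (runIdentity)

-- One step of a generator: emit an element plus the next state, or stop.
-- (Assumed, simplified shape; not taken from the library.)
data Step s a = Yield a s | Stop

-- Unfold: a "generate" step function and an "inject" action that turns
-- the seed into the initial state.
data Unfold m a b = forall s. Unfold (s -> m (Step s b)) (a -> m s)

-- Fold: an "accumulate" step, an "initial" state and an "extract" function.
data Fold m a b = forall s. Fold (s -> a -> m s) (m s) (s -> m b)

-- Hypothetical unfold that walks a list.
fromListU :: Monad m => Unfold m [a] a
fromListU = Unfold step return
  where
    step []       = return Stop
    step (x : xs) = return (Yield x xs)

-- Hypothetical fold that sums its input strictly.
sumF :: (Monad m, Num a) => Fold m a a
sumF = Fold (\acc x -> return $! acc + x) (return 0) return

-- Feed an unfold straight into a fold: a single loop from seed to result.
unfoldFold :: Monad m => Unfold m a b -> Fold m b c -> a -> m c
unfoldFold (Unfold gen inject) (Fold accum initial extract) seed = do
    s0 <- inject seed
    acc0 <- initial
    go s0 acc0
  where
    go s acc = do
        r <- gen s
        case r of
            Stop       -> extract acc
            Yield x s' -> accum acc x >>= go s'

main :: IO ()
main = print (runIdentity (unfoldFold fromListU sumF [1 .. 10 :: Int]))
-- prints 55
```

The generator state and the accumulator state are both hidden behind `forall s`, so the loop in `unfoldFold` is the only place where they meet.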
Core Modules
Type     Module                 Abbrev.
Unfold   Streamly.Data.Unfold   UF
Stream   Streamly.Prelude       S
Fold     Streamly.Data.Fold     FL
unfold and fold Functions
• S.unfold :: Unfold m a b -> a -> Stream m b
• S.fold :: Fold m b c -> Stream m b -> m c
unfold & fold
S.unfold UF.fromList [1..10]    -- unfold
    & S.fold FL.sum             -- fold
unfold & fold = Action
a --unfold--> Stream m b --fold--> m c
a -> m c
unfold & fold = Loop!
a --> Stream m b --> m c
a -> m c
Stream Types
Type          Coercion Modifier
SerialT m a   serially
AsyncT m a    asyncly
AheadT m a    aheadly
Summing an Int Stream
S.unfold UF.fromList [1..10] -- SerialT Identity Int
& S.fold FL.sum -- Identity Int
Summing an Int Stream (Better Ergonomics)
S.fromList [1..10]    -- SerialT Identity Int
    & S.sum           -- Identity Int
S.fromList = S.unfold UF.fromList
S.sum = S.fold FL.sum
File IO Examples
IO Modules
Module                           Abbrev.
Streamly.FileSystem.Handle       FH
Streamly.FileSystem.File         File
Streamly.Network.Socket          SK
Streamly.Network.Inet.TCP        TCP
Streamly.Data.Unicode.Stream     U
cat
File.toChunks "inFile"    -- SerialT IO (Array Word8)
    & FH.putChunks        -- IO ()
File Copy (cp)
File.toChunks "inFile"              -- SerialT IO (Array Word8)
    & File.fromChunks "outFile"     -- IO ()
Count bytes in a File (wc -c)
File.toBytes "inFile"       -- SerialT IO Word8
    & S.fold FL.length      -- IO Int
Count Lines in a File (wc -l)
File.toBytes "inFile"     -- SerialT IO Word8
    & S.fold nlines       -- IO Int
countl :: Int -> Word8 -> Int
countl n ch = if ch == 10 then n + 1 else n    -- 10 is the byte value of '\n'
nlines :: Monad m => Fold m Word8 Int
nlines = FL.mkPure countl 0 id
Count Words in a File (wc -w)
File.toBytes "inFile"     -- SerialT IO Word8
    & S.fold nwords       -- IO Int
countw :: (Int, Bool) -> Word8 -> (Int, Bool)
countw (n, wasSpace) ch =
    if isSpace (chr (fromIntegral ch))
    then (n, True)
    else (if wasSpace then n + 1 else n, False)
nwords :: Monad m => Fold m Word8 Int
nwords = FL.mkPure countw (0, True) fst
Splitting a Stream
(Composing Folds)
Count Bytes, Lines & Words (wc -clw)
File.toBytes "inFile"                                      -- SerialT IO Word8
    & S.fold ((,,) <$> nlines <*> nwords <*> FL.length)    -- IO (Int, Int, Int)
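How the applicative composition above can feed every element to all three folds in one pass is sketched below with base only. The pure `Fold` type, the strict `Pair`, and `runFold` are simplified stand-ins assumed for illustration (Streamly's folds are monadic and work on `Word8`); only the names `nlines` and `nwords` are borrowed from the slides.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Char (isSpace)
import Data.List (foldl')

-- A pure left fold: step, initial state, extract. (Assumed simplified model.)
data Fold a b = forall s. Fold (s -> a -> s) s (s -> b)

-- Strict pair so that neither accumulator builds up thunks.
data Pair x y = Pair !x !y

instance Functor (Fold a) where
    fmap f (Fold step initial extract) = Fold step initial (f . extract)

instance Applicative (Fold a) where
    pure b = Fold (\() _ -> ()) () (const b)
    Fold stepL initL extractL <*> Fold stepR initR extractR =
        Fold step (Pair initL initR) extract
      where
        -- Each input element is handed to both folds.
        step (Pair l r) a = Pair (stepL l a) (stepR r a)
        extract (Pair l r) = extractL l (extractR r)

-- Run a fold over a list in a single traversal.
runFold :: Fold a b -> [a] -> b
runFold (Fold step initial extract) = extract . foldl' step initial

lengthF :: Fold a Int
lengthF = Fold (\n _ -> n + 1) 0 id

nlines :: Fold Char Int
nlines = Fold (\n c -> if c == '\n' then n + 1 else n) 0 id

nwords :: Fold Char Int
nwords = Fold step (Pair 0 True) (\(Pair n _) -> n)
  where
    step (Pair n wasSpace) c
        | isSpace c = Pair n True
        | wasSpace  = Pair (n + 1) False
        | otherwise = Pair n False

main :: IO ()
main = print (runFold ((,,) <$> nlines <*> nwords <*> lengthF)
                      "hello world\nfoo bar\n")
-- prints (2,4,20)
```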
Sending output to multiple files (tee)
File.toBytes "inFile"                           -- SerialT IO Word8
    & S.fold (FL.tee (File.write "outFile1")
                     (File.write "outFile2"))   -- IO ((),())
Splitting a file (split)
type SIO = StateT (Maybe (Handle, Int)) IO
splitFile :: Handle -> IO ()
splitFile inHandle =
      S.unfold FH.read inHandle               -- SerialT IO Word8
    & S.liftInner                             -- SerialT SIO Word8
    & S.chunksOf2 64 newHandle FH.write2      -- SerialT SIO ()
    & S.evalStateT Nothing                    -- SerialT IO ()
    & S.drain                                 -- IO ()
Uses experimental APIs
Transformation
Pipeline
Word Classifier
File.toBytes "inFile"           -- SerialT IO Word8
    & S.decodeLatin1            -- SerialT IO Char
    & S.map toLower             -- SerialT IO Char
    & S.words FL.toList         -- SerialT IO String
    & S.filter (all isAlpha)    -- SerialT IO String
    & toHashMap                 -- IO (Map String (IORef Int))
alter Nothing = fmap Just $ newIORef (1 :: Int)
alter (Just ref) = do
    modifyIORef' ref (+ 1)
    return (Just ref)
toHashMap = S.foldlM' (flip (Map.alterF alter)) Map.empty
See https://github.com/composewell/streamly/blob/master/examples/WordClassifier.hs
Adapted from an example by Patrick Thomson
Word Size Histogram
bucket :: Int -> (Int, Int)
bucket n = let i = n `div` 10
           in if i > 9 then (9,n) else (i,n)
File.toBytes "inFile"                       -- SerialT IO Word8
    & S.words FL.length                     -- SerialT IO Int
    & S.map bucket                          -- SerialT IO (Int, Int)
    & S.fold (FL.classify FL.length)        -- IO (Map Int Int)
classify routes each (k, v) pair to the Map entry for key k and applies the length fold to the stream of values in each bucket.
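What the classify step computes can be approximated on a plain list. This is a rough sketch: `classifyLength` is a hypothetical stand-in for `FL.classify FL.length`, written with `Data.Map.Strict` from the containers package, and `bucket` is assumed to use integer division so that the `i > 9` clamp can take effect.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Bucket sizes in steps of 10; sizes of 100 or more share bucket 9.
-- (Assumes `div`; with `mod` the clamp would never fire.)
bucket :: Int -> (Int, Int)
bucket n = let i = n `div` 10 in if i > 9 then (9, n) else (i, n)

-- Hypothetical list-based stand-in for `FL.classify FL.length`:
-- count how many values arrive for each key.
classifyLength :: Ord k => [(k, v)] -> Map.Map k Int
classifyLength = foldl' (\m (k, _) -> Map.insertWith (+) k 1 m) Map.empty

-- Histogram of word sizes for a piece of text.
histogram :: String -> Map.Map Int Int
histogram = classifyLength . map (bucket . length) . words

main :: IO ()
main = print (Map.toList (histogram "a bb streamlyrocks concurrency"))
-- prints [(0,2),(1,2)]
```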
Debugging a Pipeline (trace/tap)
File.toBytes "inFile"                       -- SerialT IO Word8
    & S.words FL.length                     -- SerialT IO Int
    & S.map bucket                          -- SerialT IO (Int, Int)
    & S.trace print                         -- SerialT IO (Int, Int)
    & S.fold (FL.classify FL.length)        -- IO (Map Int Int)
classify routes each (k, v) pair to the Map entry for key k and applies the length fold to the stream of values in each bucket.
Combining Streams
(Composing Unfolds)
Appending N Streams (cat dir/* > outfile)
Dir.toFiles dirname                 -- SerialT IO String
    & S.concatUnfold File.read      -- SerialT IO Word8
    & File.fromBytes "outFile"      -- IO ()
Outer Product
(Nested Loops)
mult :: (Int, Int) -> Int
mult (x, y) = x * y
from :: Monad m => Unfold m Int Int
from = UF.enumerateFromToIntegral 1000
cross :: Monad m => Unfold m (Int, Int) Int
cross =
UF.outerProduct from from
& UF.map mult
UF.fold cross FL.sum (1,1)
Better Replacement for ListT
and LogicT
loops :: SerialT IO ()
loops = do
x <- S.fromList [1,2]
y <- S.fromList [3,4]
S.yieldM $ putStrLn $ show (x, y)
(1,3)
(1,4)
(2,3)
(2,4)
Declarative
Concurrency
Lookup words
get :: String -> IO String
get s = liftIO (httpNoBody (parseRequest_ s)) >> return s
fetch :: String -> IO (String, String)
fetch w =
    (,) <$> pure w <*> get ("https://www.google.com/search?q=" ++ w)
wordList :: [String]
wordList = ["cat", "dog", "mouse"]
meanings :: [IO (String, String)]
meanings = map fetch wordList
Serially
S.fromListM meanings -- SerialT IO (String, String)
& S.map show -- SerialT IO String
& FH.putStrings -- IO ()
Asynchronously
S.fromListM meanings -- AsyncT IO (String, String)
& asyncly -- SerialT IO (String, String)
& S.map show -- SerialT IO String
& FH.putStrings -- IO ()
Speculatively
(Look Ahead)
S.fromListM meanings -- AheadT IO (String, String)
& aheadly -- SerialT IO (String, String)
& S.map show -- SerialT IO String
& FH.putStrings -- IO ()
Word Lookup Server
S.unfold TCP.acceptOnPort 8090    -- SerialT IO Socket
    & serially                    -- AsyncT IO Socket
    & S.mapM serve                -- AsyncT IO ()
    & asyncly                     -- SerialT IO ()
    & S.drain                     -- IO ()
lookupWords :: Socket -> IO ()
lookupWords sk =
      S.unfold SK.read sk                       -- SerialT IO Word8
    & U.decodeLatin1                            -- SerialT IO Char
    & U.words FL.toList                         -- SerialT IO String
    & serially                                  -- AheadT IO String
    & S.mapM fetch                              -- AheadT IO (String, String)
    & aheadly                                   -- SerialT IO (String, String)
    & S.map show                                -- SerialT IO String
    & S.intercalateSuffix "\n" UF.identity      -- SerialT IO String
    & S.fold (SK.writeStrings sk)               -- IO ()
serve :: Socket -> IO ()
serve sk = finally (lookupWords sk) (close sk)
Rate Control Req/Sec
lookupWords :: Socket -> IO ()
lookupWords sk =
      S.unfold SK.read sk               -- SerialT IO Word8
    & U.decodeLatin1                    -- SerialT IO Char
    & U.words FL.toList                 -- SerialT IO String
    & serially                          -- AheadT IO String
    & S.mapM fetch                      -- AheadT IO (String, String)
    & S.maxRate 10                      -- AheadT IO (String, String)
    & aheadly                           -- SerialT IO (String, String)
    & S.map show                        -- SerialT IO String
    & S.fold (SK.writeStrings sk)       -- IO ()
Rate Control Conns/Sec
S.unfold TCP.acceptOnPort 8090    -- SerialT IO Socket
    & serially                    -- AsyncT IO Socket
    & S.mapM serve                -- AsyncT IO ()
    & S.maxRate 10                -- AsyncT IO ()
    & asyncly                     -- SerialT IO ()
    & S.drain                     -- IO ()
Merging Live Word Streams
S.unfold TCP.acceptOnPort 8090              -- SerialT IO Socket
    & S.concatMapWith S.parallel recv       -- SerialT IO String
    & U.unwords UF.fromList                 -- SerialT IO Char
    & U.encodeLatin1                        -- SerialT IO Word8
    & File.fromBytes "outFile"              -- IO ()
readWords :: Socket -> SerialT IO String
readWords sk =
      S.unfold SK.read sk       -- SerialT IO Word8
    & U.decodeLatin1            -- SerialT IO Char
    & U.words FL.toList         -- SerialT IO String
recv :: Socket -> SerialT IO String
recv sk = S.finally (liftIO $ close sk) (readWords sk)
Recursive Directory Listing
Concurrently
listDir :: Either String String -> AheadT IO String
listDir (Left dir) =
Dir.toEither dir -- SerialT IO (Either String String)
& S.map (prefixDir dir) -- SerialT IO (Either String String)
& S.consM (return dir)
. S.concatMapWith ahead listDir -- SerialT IO String
listDir (Right file) = S.yield file -- SerialT IO String
S.mapM_ print $ aheadly $ listDir (Left ".")
Demand Scaled Concurrency
• No threads if no one is consuming the stream
• Concurrency increases as the consumption rate increases
• maxThreads and maxBuffer can control the limits
Concurrent Folds
(Consume Concurrently)
Write Concurrently to multiple Destinations
FH.getBytes                                             -- SerialT IO Word8
    & S.tapAsync (TCP.fromBytes (192,168,1,10) 8091)    -- SerialT IO Word8
    & S.tapAsync (TCP.fromBytes (192,168,1,11) 8091)    -- SerialT IO Word8
    & File.fromBytes "outFile"                          -- IO ()
Concurrent ListT
(Nested Loops)
Non-determinism (Looping)
loops = do
    x <- S.fromList [1,2]
    y <- S.fromList [3,4]
    liftIO $ putStrLn $ show (x, y)
-- Choose one of:
main = S.drain $ serially $ loops
main = S.drain $ asyncly $ loops
main = S.drain $ aheadly $ loops
Streaming + Concurrency = Reactive Programming
Reactive Programming
• Reactive programs (games, GUIs) can be elegantly expressed with declarative concurrency.
• See the Acid Rain game example in the package:
https://github.com/composewell/streamly/blob/master/examples/AcidRain.hs
• See the Circling Square example from Yampa, in the package:
https://github.com/composewell/streamly/blob/master/examples/CirclingSquare.hs
Performance
Micro Benchmarks (GHC 8.8.1)
• A stream of 1 million elements is generated
• unfoldrM is used to generate the stream
• Two types of operations on the stream are measured:
  • a single operation applied once
  • a mix of operations applied multiple times
• Compiled using GHC 8.8.1
• All benchmarks are single threaded
• Ran on a MacBook Pro with an Intel Core i7 processor
• Because there is a lot of variance, results are compared as multiples rather than as percentage differences.
Comparison with Haskell lists
(GHC-8.8.1) (time)
Comparison with lists (GHC-8.8.1) (Micro Benchmarks)
• Lists are slower than Streamly in most operations; in the worst case, 150 times slower.
• Streamly is slower than lists for concatMap and append operations.
• There is no significant difference in memory consumption.
Comparison with streaming libraries
(time)
streaming-0.2.3.0
conduit-1.3.1.1
pipes-4.3.12
Comparison with streaming libraries
(memory)
streaming-0.2.3.0
conduit-1.3.1.1
pipes-4.3.12
Comparison with streaming libraries
• All libraries are significantly slower than Streamly for all operations (ranging from 1.2x to 1100x).
• Streaming and Streamly both consistently use the same amount of memory across all operations.
• Conduit and pipes spike up to 32x memory in certain operations.
Comparison With C
Counting Words in C
(The State)
struct statfs fsb;
uintmax_t linect, wordct, charct;
int fd, len;
short gotsp;
uint8_t *p;
uint8_t *buf;
linect = wordct = charct = 0;
if ((fd = open(argv[1], O_RDONLY, 0)) < 0) {
perror("open");
exit(EXIT_FAILURE);
}
if (fstatfs(fd, &fsb)) {
perror("fstatfs");
exit(EXIT_FAILURE);
}
buf = malloc(fsb.f_bsize);
if (!buf) {
perror("malloc");
exit(EXIT_FAILURE);
}
gotsp = 1;
Counting Words in C (The Logic)
while ((len = read(fd, buf, fsb.f_bsize)) != 0) {
    if (len == -1) {
        perror("read");
        exit(EXIT_FAILURE);
    }
    p = buf;
    /* Inner loop: scan the bytes just read */
    while (len > 0) {
        uint8_t ch = *p;
        charct++;
        len -= 1;
        p += 1;
        if (ch == '\n')
            ++linect;
        if (isspace(ch))
            gotsp = 1;
        else if (gotsp) {
            gotsp = 0;
            ++wordct;
        }
    }
}
Counting Words in Haskell
data WordCount = WordCount !Int !Bool
data Counts = Counts !Int !Int !WordCount
initialCounts = Counts 0 0 (WordCount 0 True)
countl :: Int -> Word8 -> Int
countl n ch = if (ch == 10) then n + 1 else n
countw :: WordCount -> Word8 -> WordCount
countw (WordCount n wasSpace) ch =
if (isSpace $ chr $ fromIntegral ch)
then WordCount n True
else WordCount (if wasSpace then n + 1 else n) False
{-# INLINE updateCounts #-}
updateCounts :: Counts -> Word8 -> Counts
updateCounts (Counts c l w) ch = Counts (c + 1) (countl l ch) (countw w ch)
wc :: Handle -> IO Counts
wc h =
S.unfold FH.read h -- SerialT IO Word8
& S.foldl' updateCounts initialCounts -- IO Counts
Word Counting: C vs Haskell (550 MB text file)
C              Haskell
2.42 seconds   2.17 seconds
Can Haskell be as fast as C?
• Each Haskell combinator represents a small piece of the loop.
• The programmer composes the loop using these combinators.
• GHC fuses these pieces together (stream fusion) to create a monolithic loop like the one in C.
• The optimized code that GHC generates has the same structure as the C code, with both of the loops seen in the C program.
• This is global program optimization: an efficient big picture is created from smaller pieces.
• GCC can only perform low-level optimization.
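The idea in these bullets can be sketched with a toy step-based stream in base Haskell. This is an illustrative model only, not Streamly's implementation: `Step`, `Stream` and the `...S` combinators are hypothetical names, and whether GHC actually fuses them depends on inlining, which the sketch does not try to control. It shows only that the composed pipeline and the hand-written loop compute the same thing.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE ExistentialQuantification #-}

-- A stream is a step function over a hidden state (assumed toy model).
data Step s a = Yield a s | Skip s | Stop
data Stream a = forall s. Stream (s -> Step s a) s

-- Each combinator contributes one small piece of the eventual loop.
enumFromToS :: Int -> Int -> Stream Int
enumFromToS lo hi = Stream step lo
  where
    step i | i > hi    = Stop
           | otherwise = Yield i (i + 1)

mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream step s0) = Stream step' s0
  where
    step' s = case step s of
        Yield x s' -> Yield (f x) s'
        Skip s'    -> Skip s'
        Stop       -> Stop

filterS :: (a -> Bool) -> Stream a -> Stream a
filterS p (Stream step s0) = Stream step' s0
  where
    step' s = case step s of
        Yield x s' | p x       -> Yield x s'
                   | otherwise -> Skip s'
        Skip s'    -> Skip s'
        Stop       -> Stop

foldlS :: (b -> a -> b) -> b -> Stream a -> b
foldlS f z (Stream step s0) = go z s0
  where
    go !acc s = case step s of
        Yield x s' -> go (f acc x) s'
        Skip s'    -> go acc s'
        Stop       -> acc

-- The loop composed from small pieces.
composed :: Int -> Int
composed n = foldlS (+) 0 (mapS (* 2) (filterS even (enumFromToS 1 n)))

-- The monolithic loop those pieces are meant to fuse into.
handWritten :: Int -> Int
handWritten n = go 0 1
  where
    go !acc i
        | i > n     = acc
        | even i    = go (acc + 2 * i) (i + 1)
        | otherwise = go acc (i + 1)

main :: IO ()
main = print (composed 1000, handWritten 1000)
-- prints (501000,501000)
```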
How can GHC perform global optimizations?
• Strong types and purity make equational reasoning possible.
• This allows GHC to reliably transform the code and fuse parts of it to generate efficient code.
• Global program optimization is not possible in C.
What are the downsides?
• Stream fusion depends on inlining and SpecConstr optimizations.
• Careful INLINE annotations are often needed.
• Higher-order functions require INLINE in the “right” phase.
• GHC may need more work to perform fusion fully reliably.
• Because of global optimization, compilation is slower and may get slower as the size of the program increases.
Streamly GHC Plugin For Fusion
• Internal join bindings beyond a size threshold are not inlined, blocking fusion.
• We mark stream constructors with a pragma to identify them in Core.
• Join points with such constructors are inlined irrespective of their size.
• This allows more reliable fusion.
The Project
Current State of The Project
• ~25K LOC, ~16K Doc, ~95 files, 18 contributors
• High quality, tested, production capable
• Some parts of the API may change in the near future
• type names may change
• module structure may change
Work In Progress
• Stream parsers
• Concurrent folds
• Splitting and merging transformations
Roadmap
• Shared concurrent state
• Persistent queues
• Vector instructions
• Distributed processing
• A lot more
Thank You!
harendra@composewell.com
twitter: @hk_hooda
https://github.com/composewell/streamly
https://gitter.im/composewell/streamly

Understanding Globus Data Transfers with NetSage
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptxText-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
 

Streamly: Concurrent Data Flow Programming

  • 1. © 2019 Composewell Technologies Streamly: Concurrent Data Flow Programming Harendra Kumar 15 Nov 2019 Composewell Technologies
  • 2. About Streamly: Ergonomics of Shell, Safety of Haskell, Speed of C, Magical Concurrency. https://github.com/composewell/streamly
  • 3. About Me
        • C programming
        • OS kernel, file systems
        • Haskell since 2015
        harendra@composewell.com, @hk_hooda
  • 4. About Composewell
        • Develops Streamly
        • Provides commercial support
        • Designs solutions using Streamly
        • Provides training on Streamly
        http://www.composewell.com, hello@composewell.com
  • 5. “Simplicity is a great virtue but it requires hard work to achieve it and education to appreciate it. And to make matters worse: complexity sells better.” (Edsger Wybe Dijkstra)
  • 6. Streams
  • 7. What is a stream?
        • A sequence of items of the same type
        • Combinators to process the sequence
  • 8. Pure: List. Effectful: Stream (Effectful List)
  • 9. Imperative: Loop. Functional: Stream (Modular Loops)
  • 10. Do I need streams?
        • Imperative version: Do I need loops?
  • 11. Streamly = Efficient, Composable and Concurrent Loops
  • 12. Who should use Streamly?
        • General purpose framework
        • Declarative concurrency
        • High performance, nearing or beating C
        • Some examples:
            • Server backends
            • Reactive programming
            • Real time data analysis
            • Streaming data applications
  • 13. Streamly At a Glance
  • 14. Engineering Focused Library (Goals)
        Goal            Benchmark
        Ergonomics      Python and Shell
        Performance     C
        Composability   Haskell
        Safety          Haskell
        Simplicity      Use only basic Haskell
  • 15. Composability + Performance
  • 16. Streamly Version
        • Examples in this presentation use streamly-0.7.0
        • Some examples may use APIs from Streamly.Internal.* modules
        • Source code for examples can be found at: https://github.com/composewell/streamly-examples
  • 17. Fundamental Operations (Conceptual)
        Operation   Shape
        Generate    a -> Stream m b
        Transform   Stream m a -> Stream m b
        Eliminate   Stream m a -> m b
  • 18. Fundamental Data Types (Actual)
        Type           Constructor
        Unfold m a b   forall s. Unfold (s -> m (Step s b)) (a -> m s)
                       (s: state; first field: generate; second field: inject)
        Stream m a     forall s. Stream (s -> m (Step s a)) s
                       (s: state; first field: generate; second field: state)
        Fold m a b     forall s. Fold (s -> a -> m s) (m s) (s -> m b)
                       (s: state; fields: accumulate, initial, extract)
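The three constructors on slide 18 can be tried out without the library. The following is a minimal sketch using only `base`; it copies the constructor shapes from the slide, but the `Step` type and the helper names (`unfold`, `fold`, `fromList`, `sumF`, `total`) are written here for illustration and are assumptions, not Streamly's actual implementation.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Functor.Identity (runIdentity)

-- One step of a stream state machine: emit a value, skip, or stop.
-- (Assumed shape; the slide only names the type.)
data Step s a = Yield a s | Skip s | Stop

-- Unfold: a "generate" step function plus an "inject" function that seeds the state.
data Unfold m a b = forall s. Unfold (s -> m (Step s b)) (a -> m s)

-- Stream: a "generate" step function plus the current state.
data Stream m a = forall s. Stream (s -> m (Step s a)) s

-- Fold: "accumulate" step, "initial" state, and "extract" to produce the result.
data Fold m a b = forall s. Fold (s -> a -> m s) (m s) (s -> m b)

-- Seed an Unfold to obtain a Stream. The state is wrapped in Maybe so that
-- the monadic inject action runs as the first step of the stream.
unfold :: Monad m => Unfold m a b -> a -> Stream m b
unfold (Unfold ustep inject) seed = Stream step Nothing
  where
    step Nothing  = Skip . Just <$> inject seed
    step (Just s) = do
      r <- ustep s
      return $ case r of
        Yield x s' -> Yield x (Just s')
        Skip s'    -> Skip (Just s')
        Stop       -> Stop

-- Drive a Stream into a Fold: this is the loop.
fold :: Monad m => Fold m a b -> Stream m a -> m b
fold (Fold fstep initial extract) (Stream sstep st0) = initial >>= go st0
  where
    go st acc = do
      r <- sstep st
      case r of
        Yield x st' -> do
          acc' <- fstep acc x
          acc' `seq` go st' acc'
        Skip st'    -> go st' acc
        Stop        -> extract acc

-- Hypothetical counterparts of UF.fromList and FL.sum.
fromList :: Monad m => Unfold m [a] a
fromList = Unfold step return
  where
    step []       = return Stop
    step (x : xs) = return (Yield x xs)

sumF :: (Monad m, Num a) => Fold m a a
sumF = Fold (\acc x -> return (acc + x)) (return 0) return

-- Sum 1..10 in the Identity monad.
total :: Int
total = runIdentity (fold sumF (unfold fromList [1 .. 10]))

main :: IO ()
main = print total  -- prints 55
```

All three types hide their state type `s` behind `forall s.`, so a consumer can only advance the state machine through the supplied functions. That is what lets independently written pieces be plugged together.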
  • 19. Core Modules
        Type     Module                 Abbrev.
        Unfold   Streamly.Data.Unfold   UF
        Stream   Streamly.Prelude       S
        Fold     Streamly.Data.Fold     FL
  • 20. unfold and fold Functions
        • S.unfold :: Unfold m a b -> a -> Stream m b
        • S.fold :: Fold m b c -> Stream m b -> m c
  • 21. unfold & fold
        S.unfold UF.fromList [1..10]  -- unfold
          & S.fold FL.sum             -- fold
  • 22. unfold & fold = Action
        a --unfold--> Stream m b --fold--> m c, which composes to a -> m c
  • 23. unfold & fold = Loop!
        a --> Stream m b --> m c, which composes to a -> m c
  • 24. Stream Types
        Type         Coercion Modifier
        SerialT m a  serially
        AsyncT m a   asyncly
        AheadT m a   aheadly
  • 25. Summing an Int Stream
        S.unfold UF.fromList [1..10]  -- SerialT Identity Int
          & S.fold FL.sum             -- Identity Int
  • 26. Summing an Int Stream (Better Ergonomics)
        S.fromList [1..10]  -- SerialT Identity Int
          & S.sum           -- Identity Int
        where S.fromList [1..10] = S.unfold UF.fromList [1..10] and S.sum = S.fold FL.sum
  • 27. File IO Examples
  • 28. IO Modules
        Module                         Abbrev.
        Streamly.FileSystem.Handle     FH
        Streamly.FileSystem.File       File
        Streamly.Network.Socket        SK
        Streamly.Network.Inet.TCP      TCP
        Streamly.Data.Unicode.Stream   U
  • 29. cat
        File.toChunks "inFile"  -- SerialT IO (Array Word8)
          & FH.putChunks        -- IO ()
  • 30. File Copy (cp)
        File.toChunks "inFile"         -- SerialT IO (Array Word8)
          & File.fromChunks "outFile"  -- IO ()
  • 31. Count bytes in a File (wc -c)
        File.toBytes "inFile"  -- SerialT IO Word8
          & S.fold FL.length   -- IO Int
  • 32. Count Lines in a File (wc -l)
        File.toBytes "inFile"  -- SerialT IO Word8
          & S.fold nlines      -- IO Int

        countl :: Int -> Word8 -> Int
        countl n ch = if (ch == 10) then n + 1 else n

        nlines :: Monad m => Fold m Word8 Int
        nlines = FL.mkPure countl 0 id
  • 33. Count Words in a File (wc -w)
        File.toBytes "inFile"  -- SerialT IO Word8
          & S.fold nwords      -- IO Int

        countw :: (Int, Bool) -> Word8 -> (Int, Bool)
        countw (n, wasSpace) ch =
          if (isSpace $ chr $ fromIntegral ch)
          then (n, True)
          else (if wasSpace then n + 1 else n, False)

        nwords :: Monad m => Fold m Word8 Int
        nwords = FL.mkPure countw (0, True) fst
  • 34. Splitting a Stream (Composing Folds)
  • 35. Count Bytes, Lines & Words (wc -clw)
        File.toBytes "inFile"                                  -- SerialT IO Word8
          & S.fold ((,,) <$> nlines <*> nwords <*> FL.length)  -- IO (Int, Int, Int)
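Slide 35 combines three folds with `<$>` and `<*>` so that one pass over the input feeds all of them. A minimal sketch of how such an Applicative instance can work is shown below, using only `base` and a pure `Fold` over lists. The `Fold` and `Pair` types and `runFold` here are illustrative assumptions, not Streamly's definitions, and the folds work on `Char` rather than `Word8` to keep the example short.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Char (isSpace)
import Data.List (foldl')

-- A pure left fold: step function, initial state, and result extraction.
data Fold a b = forall s. Fold (s -> a -> s) s (s -> b)

instance Functor (Fold a) where
  fmap f (Fold step initial extract) = Fold step initial (f . extract)

-- A strict pair: both component states are forced to weak head normal form
-- each time the combined state is built.
data Pair x y = Pair !x !y

-- Two folds combined with <*> share one traversal: every input element is
-- passed to both step functions, and the results are combined at the end.
instance Applicative (Fold a) where
  pure b = Fold (\() _ -> ()) () (const b)
  Fold stepL initL extractL <*> Fold stepR initR extractR =
    Fold step (Pair initL initR) extract
    where
      step (Pair l r) a  = Pair (stepL l a) (stepR r a)
      extract (Pair l r) = extractL l (extractR r)

-- Run a fold over a list in a single pass.
runFold :: Fold a b -> [a] -> b
runFold (Fold step initial extract) = extract . foldl' step initial

nbytes :: Fold Char Int
nbytes = Fold (\n _ -> n + 1) 0 id

nlines :: Fold Char Int
nlines = Fold (\n ch -> if ch == '\n' then n + 1 else n) 0 id

nwords :: Fold Char Int
nwords = Fold step (0, True) fst
  where
    step (n, wasSpace) ch
      | isSpace ch = (n, True)
      | otherwise  = (if wasSpace then n + 1 else n, False)

-- Lines, words and characters, in the same order as on the slide.
wc :: Fold Char (Int, Int, Int)
wc = (,,) <$> nlines <*> nwords <*> nbytes

main :: IO ()
main = print (runFold wc "hello world\nfoo bar baz\n")  -- prints (2,5,24)
```

The combined fold keeps a pair of the component states, so adding another count costs one more field in the state, not another traversal of the input.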
  • 36. Sending output to multiple files (tee)
        File.toBytes "inFile"                  -- SerialT IO Word8
          & S.fold (FL.tee (File.write "outFile1")
                           (File.write "outFile2"))  -- IO ((),())
  • 37. Splitting a file (split)
        type SIO = StateT (Maybe (Handle, Int)) IO

        splitFile :: FH.Handle -> IO ()
        splitFile inHandle =
            File.toBytes "inFile"                 -- SerialT IO Word8
          & S.liftInner                           -- SerialT SIO Word8
          & S.chunksOf2 64 newHandle FH.write2    -- SerialT SIO Word8
          & S.evalStateT Nothing                  -- SerialT IO ()
          & S.drain                               -- IO ()
        Uses experimental APIs
  • 38. Transformation Pipeline
  • 39. Word Classifier
        File.toBytes "inFile"       -- SerialT IO Word8
          & S.decodeLatin1          -- SerialT IO Char
          & S.map toLower           -- SerialT IO Char
          & S.words FL.toList       -- SerialT IO String
          & S.filter (all isAlpha)  -- SerialT IO String
          & toHashMap               -- IO (Map String (IORef Int))

        alter Nothing    = fmap Just $ newIORef (1 :: Int)
        alter (Just ref) = do
          modifyIORef' ref (+ 1)
          return (Just ref)

        toHashMap = S.foldlM' (flip (Map.alterF alter)) Map.empty

        See https://github.com/composewell/streamly/blob/master/examples/WordClassifier.hs
        Adapted from an example by Patrick Thomson
  • 40. Word Size Histogram
        bucket :: Int -> (Int, Int)
        bucket n = let i = n `mod` 10 in if i > 9 then (9,n) else (i,n)

        File.toBytes "inFile"                 -- SerialT IO Word8
          & S.words FL.length                 -- SerialT IO Int
          & S.map bucket                      -- SerialT IO (Int, Int)
          & S.fold (FL.classify FL.length)    -- IO (Map Int Int)
        classify routes a stream of (k,v) pairs into a Map, applying the length fold to the stream of values in each bucket
  • 41. Debugging a Pipeline (trace/tap)
        File.toBytes "inFile"                 -- SerialT IO Word8
          & S.words FL.length                 -- SerialT IO Int
          & S.map bucket                      -- SerialT IO (Int, Int)
          & S.trace print                     -- SerialT IO (Int, Int)
          & S.fold (FL.classify FL.length)    -- IO (Map Int Int)
        classify routes a stream of (k,v) pairs into a Map, applying the length fold to the stream of values in each bucket
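The behaviour described for `FL.classify FL.length` on slides 40 and 41 can be approximated on plain lists. The sketch below assumes `Data.Map.Strict` from the `containers` package that ships with GHC. `classifyLength` and `histogram` are hypothetical names introduced here, and `bucket` is copied from slide 40 unchanged.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Copied from slide 40: the key is the bucket, the value is the word length.
bucket :: Int -> (Int, Int)
bucket n = let i = n `mod` 10 in if i > 9 then (9, n) else (i, n)

-- A list-based stand-in for "classify with a length fold": count how many
-- values arrive under each key. The values themselves are ignored because
-- the inner fold only counts them.
classifyLength :: Ord k => [(k, v)] -> Map.Map k Int
classifyLength = foldl' step Map.empty
  where
    step m (k, _) = Map.insertWith (+) k 1 m

-- Split text into words, measure each word, and count words per bucket.
histogram :: String -> Map.Map Int Int
histogram = classifyLength . map (bucket . length) . words

main :: IO ()
main = print (Map.toList (histogram "a bb cc ddd"))  -- prints [(1,1),(2,2),(3,1)]
```

A general `classify` would run an arbitrary fold per key. Counting is the special case where the per-key state is a single integer.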
  • 42. Combining Streams (Composing Unfolds)
  • 43. Appending N Streams (cat dir/* > outfile)
        Dir.toFiles dirname           -- SerialT IO String
          & S.concatUnfold File.read  -- SerialT IO Word8
          & File.fromBytes "outFile"  -- IO ()
  • 44. Outer Product (Nested Loops)
        mult :: (Int, Int) -> Int
        mult (x, y) = x * y

        from :: Monad m => Unfold m Int Int
        from = UF.enumerateFromToIntegral 1000

        cross :: Monad m => Unfold m (Int, Int) Int
        cross = UF.outerProduct from from & UF.map mult

        UF.fold cross FL.sum (1,1)
  • 45. Better Replacement for ListT and LogicT
        loops :: SerialT IO ()
        loops = do
          x <- S.fromList [1,2]
          y <- S.fromList [3,4]
          S.yieldM $ putStrLn $ show (x, y)

        Output: (1,3) (1,4) (2,3) (2,4)
  • 46. Declarative Concurrency
  • 47. Lookup words
        get :: String -> IO String
        get s = liftIO (httpNoBody (parseRequest_ s)) >> return s

        fetch :: String -> IO (String, String)
        fetch w = (,) <$> pure w <*> get ("https://www.google.com/search?q=" ++ w)

        wordList :: [String]
        wordList = ["cat", "dog", "mouse"]

        meanings :: [IO (String, String)]
        meanings = map fetch wordList
  • 48. Serially
        S.fromListM meanings  -- SerialT IO (String, String)
          & S.map show        -- SerialT IO String
          & FH.putStrings     -- IO ()
  • 49. Asynchronously
        S.fromListM meanings  -- AsyncT IO (String, String)
          & asyncly           -- SerialT IO (String, String)
          & S.map show        -- SerialT IO String
          & FH.putStrings     -- IO ()
  • 50. Speculatively (Look Ahead)
        S.fromListM meanings  -- AheadT IO (String, String)
          & aheadly           -- SerialT IO (String, String)
          & S.map show        -- SerialT IO String
          & FH.putStrings     -- IO ()
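Slides 48 to 50 differ only in how the list of lookup actions is evaluated. The sketch below contrasts serial evaluation with an "ahead"-like one (actions run concurrently, results are delivered in the original order) using only `forkIO` and `MVar` from `base`. It illustrates the semantics only and is not how Streamly schedules work: it starts one thread per action up front, has no demand-based scaling or thread limit, and ignores exceptions. `lookupWord` is a hypothetical stand-in for the network `fetch` on slide 47.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Serial evaluation: each action starts only after the previous one finishes.
serial :: [IO a] -> IO [a]
serial = sequence

-- "Ahead"-style evaluation: every action runs in its own thread, and the
-- results are collected in the original list order.
-- Assumption: actions do not throw; a failing action would leave its MVar
-- empty and the collector blocked on it.
ahead :: [IO a] -> IO [a]
ahead actions = do
  vars <- mapM spawn actions
  mapM takeMVar vars
  where
    spawn act = do
      var <- newEmptyMVar
      _ <- forkIO (act >>= putMVar var)
      return var

-- Hypothetical slow lookup standing in for an HTTP request.
lookupWord :: String -> IO (String, Int)
lookupWord w = do
  threadDelay 200000  -- simulate 0.2 s of latency
  return (w, length w)

main :: IO ()
main = do
  results <- ahead (map lookupWord ["cat", "dog", "mouse"])
  mapM_ print results
```

With `serial` the three lookups take roughly the sum of their latencies. With `ahead` they overlap, and the output order stays the same as the input order.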
  • 51. Word Lookup Server
        S.unfold TCP.acceptOnPort 8090  -- SerialT IO Socket
          & S.serially                  -- AsyncT IO ()
          & S.mapM serve                -- AsyncT IO ()
          & S.asyncly                   -- SerialT IO ()
          & S.drain                     -- IO ()

        lookupWords :: Socket -> IO ()
        lookupWords sk =
            S.unfold SK.read sk                     -- SerialT IO Word8
          & U.decodeLatin1                          -- SerialT IO Char
          & U.words FL.toList                       -- SerialT IO String
          & S.serially                              -- AheadT IO String
          & S.mapM fetch                            -- AheadT IO (String, String)
          & S.aheadly                               -- SerialT IO (String, String)
          & S.map show                              -- SerialT IO String
          & S.intercalateSuffix "\n" UF.identity    -- SerialT IO String
          & S.fold (SK.writeStrings sk)             -- IO ()

        serve :: Socket -> IO ()
        serve sk = finally (lookupWords sk) (close sk)
  • 52. Rate Control Req/Sec
        lookupWords :: Socket -> IO ()
        lookupWords sk =
            S.unfold SK.read sk     -- SerialT IO Word8
          & U.decodeLatin1          -- SerialT IO Char
          & U.words FL.toList       -- SerialT IO String
          & serially                -- AheadT IO String
          & S.mapM fetch            -- AheadT IO (String, String)
          & S.maxRate 10            -- AheadT IO (String, String)
          & S.aheadly               -- SerialT IO (String, String)
          & S.map show              -- SerialT IO String
          & S.fold (SK.write sk)    -- IO ()
  • 53. Rate Control Conns/Sec
        S.unfold TCP.acceptOnPort 8090  -- SerialT IO Socket
          & S.serially                  -- AsyncT IO ()
          & S.mapM serve                -- AsyncT IO ()
          & S.maxRate 10                -- AsyncT IO ()
          & S.asyncly                   -- SerialT IO ()
          & S.drain                     -- IO ()
  • 54. Merging Live Word Streams
        S.unfold TCP.acceptOnPort 8090        -- SerialT IO Socket
          & S.concatMapWith S.parallel recv   -- SerialT IO String
          & U.unwords UF.fromList             -- SerialT IO Char
          & U.encodeLatin1                    -- SerialT IO Word8
          & File.fromBytes "outFile"          -- IO ()

        readWords :: Socket -> SerialT IO String
        readWords sk =
            S.unfold SK.read sk   -- SerialT IO Word8
          & U.decodeLatin1        -- SerialT IO Char
          & U.words FL.toList     -- SerialT IO String

        recv :: Socket -> SerialT IO String
        recv sk = S.finally (liftIO $ close sk) (readWords sk)
  • 55. Recursive Directory Listing Concurrently
        listDir :: Either String String -> AheadT IO String
        listDir (Left dir) =
            Dir.toEither dir           -- SerialT IO (Either String String)
          & S.map (prefixDir dir)      -- SerialT IO (Either String String)
          & S.consM (return dir) . S.concatMapWith ahead listDir  -- SerialT IO String
        listDir (Right file) = S.yield file  -- SerialT IO String

        S.mapM_ print $ aheadly $ listDir (Left ".")
  • 56. Demand Scaled Concurrency
        • No threads if no one is consuming the stream
        • Concurrency increases as the consumption rate increases
        • maxThreads and maxBuffer can control the limits
  • 57. Concurrent Folds (Consume Concurrently)
  • 58. Write Concurrently to multiple Destinations
        FH.getBytes                                           -- SerialT IO Word8
          & S.tapAsync (TCP.fromBytes (192,168,1,10) 8091)    -- SerialT IO Word8
          & S.tapAsync (TCP.fromBytes (192,168,1,11) 8091)    -- SerialT IO Word8
          & File.fromBytes "outFile"                          -- IO ()
  • 59. Concurrent ListT (Nested Loops)
  • 60. Non-determinism (Looping)
        loops = do
          x <- each [1,2]
          y <- each [3,4]
          liftIO $ putStrLn $ show (x, y)

        main = S.drain $ serially $ loops
        main = S.drain $ asyncly $ loops
        main = S.drain $ aheadly $ loops
  • 61. Streaming + Concurrency = Reactive Programming
  • 62. Reactive Programming
        • Reactive programs (games, GUI) can be elegantly expressed by declarative concurrency.
        • See the Acid Rain game example in the package: https://github.com/composewell/streamly/blob/master/examples/AcidRain.hs
        • See the Circling Square example from Yampa, in the package: https://github.com/composewell/streamly/blob/master/examples/CirclingSquare.hs
  • 63. Performance
  • 64. Micro Benchmarks (GHC 8.8.1)
        • A stream of 1 million elements is generated
        • unfoldrM is used to generate the stream
        • Two types of operations on the stream are measured:
            • single operation applied once
            • a mix of operations applied multiple times
        • Compiled using GHC 8.8.1
        • All benchmarks are single threaded
        • Ran on a MacBook Pro with an Intel Core i7 processor
        • Because there is a lot of variance, the comparison is in multiples rather than as a percentage difference.
  • 65. Comparison with Haskell lists (GHC-8.8.1) (time)
  • 66. Comparison with lists (GHC-8.8.1) (Micro Benchmarks)
        • Lists are slower than streamly in most operations; in the worst case they are 150 times slower.
        • Streamly is slower than lists for concatMap and append operations.
        • There is no significant difference in memory consumption.
  • 67. Comparison with streaming libraries (time): streaming-0.2.3.0, conduit-1.3.1.1, pipes-4.3.12
  • 68. Comparison with streaming libraries (memory): streaming-0.2.3.0, conduit-1.3.1.1, pipes-4.3.12
  • 69. Comparison with streaming libraries
        • All libraries are significantly slower (ranging from 1.2x to 1100x) than streamly for all operations.
        • Streaming and streamly both consistently utilize the same amount of memory across all ops.
        • Conduit and pipes spike up to 32x memory in certain operations.
  • 70. Comparison With C
  • 71. Counting Words in C (The State)
        struct statfs fsb;
        uintmax_t linect, wordct, charct;
        int fd, len;
        short gotsp;
        uint8_t *p;
        uint8_t *buf;

        linect = wordct = charct = 0;
        if ((fd = open(argv[1], O_RDONLY, 0)) < 0) {
            perror("open");
            exit(EXIT_FAILURE);
        }
        if (fstatfs(fd, &fsb)) {
            perror("fstatfs");
            exit(EXIT_FAILURE);
        }
        buf = malloc(fsb.f_bsize);
        if (!buf) {
            perror("malloc");
            exit(EXIT_FAILURE);
        }
        gotsp = 1;
  • 72. Counting Words in C (The Logic)
        while ((len = read(fd, buf, fsb.f_bsize)) != 0) {
            if (len == -1) {
                perror("read");
                exit(EXIT_FAILURE);
            }
            p = buf;
            …
        }

        while (len > 0) {
            uint8_t ch = *p;
            charct++;
            len -= 1;
            p += 1;
            if (ch == '\n')
                ++linect;
            if (isspace(ch))
                gotsp = 1;
            else if (gotsp) {
                gotsp = 0;
                ++wordct;
            }
        }
  • 73. Counting Words in Haskell
        data WordCount = WordCount !Int !Bool
        data Counts = Counts !Int !Int !WordCount

        initialCounts = Counts 0 0 (WordCount 0 True)

        countl :: Int -> Word8 -> Int
        countl n ch = if (ch == 10) then n + 1 else n

        countw :: WordCount -> Word8 -> WordCount
        countw (WordCount n wasSpace) ch =
          if (isSpace $ chr $ fromIntegral ch)
          then WordCount n True
          else WordCount (if wasSpace then n + 1 else n) False

        {-# INLINE updateCounts #-}
        updateCounts :: Counts -> Word8 -> Counts
        updateCounts (Counts c l w) ch = Counts (c + 1) (countl l ch) (countw w ch)

        wc :: Handle -> IO Counts
        wc h =
            S.unfold FH.read h                    -- SerialT IO Word8
          & S.foldl' updateCounts initialCounts   -- IO Counts
  • 74. Word Counting: C vs Haskell (550 MB text file)
        C         2.42 seconds
        Haskell   2.17 seconds
  • 75. Can Haskell be as fast as C?
        • Each Haskell combinator represents a small piece of the loop.
        • The programmer composes the loop using these combinators.
        • GHC fuses these pieces together (stream fusion) to create a monolithic loop, like the C version.
        • The optimized code that GHC produces has the same structure as the C code, with both of the loops seen in the C program.
        • This is global program optimization: an efficient big picture is created from smaller pieces.
        • GCC can only perform low-level optimization.
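The claim on slide 75 that each combinator is "a small piece of the loop" can be illustrated with a pure stream type. In the sketch below, `mapS` and `filterS` are non-recursive: they only wrap the step function of the stream they receive, and the single recursive loop lives in `foldlS`. All names are hypothetical and written for this illustration. Whether GHC actually removes the intermediate `Step` values depends on inlining and the optimization level, so this shows only the structure that makes fusion possible, not a performance guarantee.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- One step of a stream: emit a value, skip without emitting, or stop.
data Step s a = Yield a s | Skip s | Stop

-- A pure stream: a step function plus a starting state.
data Stream a = forall s. Stream (s -> Step s a) s

-- Generate the integers from lo to hi. Not recursive: just a step function.
enumFromToS :: Int -> Int -> Stream Int
enumFromToS lo hi = Stream step lo
  where
    step i
      | i > hi    = Stop
      | otherwise = Yield i (i + 1)

-- Transform each element. Not recursive: wraps the upstream step function.
mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream step s0) = Stream step' s0
  where
    step' s = case step s of
      Yield x s' -> Yield (f x) s'
      Skip s'    -> Skip s'
      Stop       -> Stop

-- Drop elements failing the predicate by turning their Yield into a Skip.
filterS :: (a -> Bool) -> Stream a -> Stream a
filterS p (Stream step s0) = Stream step' s0
  where
    step' s = case step s of
      Yield x s'
        | p x       -> Yield x s'
        | otherwise -> Skip s'
      Skip s'    -> Skip s'
      Stop       -> Stop

-- The only recursive function: a strict left fold that drives the composed
-- step function until it stops.
foldlS :: (b -> a -> b) -> b -> Stream a -> b
foldlS f z (Stream step s0) = go z s0
  where
    go acc s = acc `seq` case step s of
      Yield x s' -> go (f acc x) s'
      Skip s'    -> go acc s'
      Stop       -> acc

-- Sum of the squares of the even numbers in 1..n, written as a pipeline.
sumEvenSquares :: Int -> Int
sumEvenSquares n =
  foldlS (+) 0 (mapS (\x -> x * x) (filterS even (enumFromToS 1 n)))

main :: IO ()
main = print (sumEvenSquares 10)  -- prints 220
```

Because the transformers contain no recursion, a compiler can inline them into the fold's loop. After that, the `Yield`/`Skip`/`Stop` case analysis can in principle be simplified away, leaving one loop over an integer counter.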
  • 76. How can GHC perform global optimizations?
        • Strong types and purity make equational reasoning possible.
        • This allows GHC to reliably perform transformations over the code and fuse parts of the code to generate efficient code.
        • Global program optimization is not possible in C.
  • 77. What are the downsides?
        • Stream fusion depends on inlining and SpecConstr optimizations.
        • Careful INLINE annotations are often needed.
        • Higher order functions require INLINE in the “right” phase.
        • GHC may need more work to perform fusion fully reliably.
        • Because of global optimization, compilation is slower and may get slower as the size of the program increases.
  • 78. Streamly GHC Plugin For Fusion
        • Internal join bindings beyond a size threshold are not inlined, blocking fusion.
        • We mark stream constructors with a pragma to identify them in the core.
        • Join points with such constructors are inlined irrespective of the size.
        • This allows more reliable fusion.
  • 79. The Project
  • 80. Current State of The Project
        • ~25K LOC, ~16K Doc, ~95 files, 18 contributors
        • High quality, tested, production capable
        • Some parts of the API may change in near future
            • type names may change
            • module structure may change
  • 81. Work In Progress
        • Stream parsers
        • Concurrent folds
        • Splitting and merging transformations
  • 82. Roadmap
        • Shared concurrent state
        • Persistent queues
        • Vector instructions
        • Distributed processing
        • Lot more stuff
  • 83. Thank You!
        harendra@composewell.com
        twitter: @hk_hooda
        https://github.com/composewell/streamly
        https://gitter.im/composewell/streamly