SlideShare a Scribd company logo
val inputStream = s.readStream
.format("kinesis")
.option("streamName", "stream-name") // Kinesis Stream Name
.option("region", "ap-northeast-1") // Kinesis Region
.option("initialPosition", "TRIM_HORIZON") // Kinesis initial position
.load()
case class SampleSchema(column1: String, column2: Int)
val testSchema = ScalaReflection.schemaFor[SampleSchema].dataType.asInstanceOf[StructType]
val structuredStream = inputStream
.select(from_json(col("data").cast("string"), testSchema).as("data")) // Parse JSON and convert to testSchema
.select("data.*") // select JSON columns
val kclSource = KCLSource("scalamatsuri")
val producer = Flow[UserLog]
.map(_.toJson.compactPrint)
.map(ByteString.fromString(_))
.map(KinesisStreamObject(UUID.randomUUID().toString, _))
.toMat(KinesisProducer("scalamatsuri2", KinesisProducerSetting.fromShardCount(1)))(Keep.right)
val extract = Flow.fromFunction[ByteString, PurchaseLog](_.utf8String.parseJson.convertTo[PurchaseLog])
val convert = Flow[PurchaseLog].mapConcat { log =>
log.items.map { item =>
UserLog(
log.userId,
// properties...
)
}
}
val result =
kclSource
.via(extract)
.via(convert)
.toMat(producer)(Keep.right)
.run()
val stream = inputStream
.select(from_json(col("data").cast("string"), purchaseLogSchema).as("data"))
.select("data.*")
.select(
$"user_id",
$"time",
expr("explode(items)").as("item")
)
.select(
$"user_id",
$"time",
lit("PURCHASE").as("log_type"),
$"item.item_id".as("item_id"),
$"item.price".as("price"),
$"item.quantity".as("quantity"),
unix_timestamp().as("created_time")
)
.as[UserLog] // case class UserLog(
stream
.writeStream
.foreach(sink)
.outputMode("update")
.start()
val http = Http()
val separator = ByteString.fromString(Properties.lineSeparator)
val separatorBytes = separator.length
val MAX_FILE_SIZE = 10 * 1024 * 1024
val MAX_BYTES_PER_HOUR = 100 * 1024 * 1024
val kclSource = KCLSource("scalamatsuri")
val buffering = Flow[ByteString]
.batchWeighted[ByteString](MAX_FILE_SIZE, _.length, seed => seed)(_ ++ separator ++ _)
.throttle(MAX_BYTES_PER_HOUR, Duration(1, HOURS), _.length)
val send: Flow[ByteString, ByteString] = ??? //
val logging = Sink.foreach[ByteString] { body =>
logger.info(body.utf8String)
}
val request = kclSource
.via(buffering)
.via(send)
.toMat(logging)(Keep.right)
.run()
val inputStream = spark.readStream
.format("kinesis")
.option("streamName", scalamatsuriStraem)
.option("region", "ap-northeast-1")
.option("initialPosition", "TRIM_HORIZON")
.option("awsAccessKey", awsAccessKeyId)
.option("awsSecretKey", awsSecretKey)
.option("maxRecordsPerFetch", 100000)
.option("fetchBufferSize", 100000)
.option("maxFetchDuration", 1.0)
.load()
val strteamWithWM = stream1.as("in1")
.withColumn("dt", to_timestamp(from_unixtime(col("time"))))
.withWatermark("dt", "3 hours")
val stream2WithWM = stream2.as("in2")
.withColumn("dt", to_timestamp(from_unixtime(col("time"))))
.withWatermark("dt", "1 hours")
strteamWithWM.join(
stream2WithWM,
expr("""
in1.user_id = in2.user_id
AND
in1.time < in2.time
AND
in1.time >= in2.time + interval 3 hour
"""),
"leftOuter"
)
Akka streams vs spark structured streaming
Akka streams vs spark structured streaming
Akka streams vs spark structured streaming

More Related Content

What's hot

Djangoによるスマホアプリバックエンドの実装
Djangoによるスマホアプリバックエンドの実装Djangoによるスマホアプリバックエンドの実装
Djangoによるスマホアプリバックエンドの実装
Nakazawa Yuichi
 
[2019] 바르게, 빠르게! Reactive를 품은 Spring Kafka
[2019] 바르게, 빠르게! Reactive를 품은 Spring Kafka[2019] 바르게, 빠르게! Reactive를 품은 Spring Kafka
[2019] 바르게, 빠르게! Reactive를 품은 Spring Kafka
NHN FORWARD
 
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
Amazon Web Services
 
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
SANG WON PARK
 
LMAX Architecture
LMAX ArchitectureLMAX Architecture
LMAX Architecture
Stephan Schmidt
 
Deep Dive into Apache Kafka
Deep Dive into Apache KafkaDeep Dive into Apache Kafka
Deep Dive into Apache Kafka
confluent
 
Keycloak拡張入門
Keycloak拡張入門Keycloak拡張入門
Keycloak拡張入門
Hiroyuki Wada
 
EtherCATやPROFINETを OPC UAで接続してみた
EtherCATやPROFINETを OPC UAで接続してみたEtherCATやPROFINETを OPC UAで接続してみた
EtherCATやPROFINETを OPC UAで接続してみた
ミソジ
 
より高品質なメディアサービスを目指す ABEMA の技術進化
より高品質なメディアサービスを目指す ABEMA の技術進化より高品質なメディアサービスを目指す ABEMA の技術進化
より高品質なメディアサービスを目指す ABEMA の技術進化
Yusuke Goto
 
KeycloakでAPI認可に入門する
KeycloakでAPI認可に入門するKeycloakでAPI認可に入門する
KeycloakでAPI認可に入門する
Hitachi, Ltd. OSS Solution Center.
 
Flask With Server-Sent Event
Flask With Server-Sent EventFlask With Server-Sent Event
Flask With Server-Sent Event
Tencent
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
Aparna Pillai
 
在學校開機場把自己的學費賺回來!.pptx
在學校開機場把自己的學費賺回來!.pptx在學校開機場把自己的學費賺回來!.pptx
在學校開機場把自己的學費賺回來!.pptx
HsiangMingHung
 
Php Performance On Windows
Php Performance On WindowsPhp Performance On Windows
Php Performance On Windows
ruslany
 
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GCHadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Erik Krogen
 
UserParameter vs Zabbix Sender - 1º ZABBIX MEETUP DO INTERIOR-SP
UserParameter vs Zabbix Sender - 1º ZABBIX MEETUP DO INTERIOR-SPUserParameter vs Zabbix Sender - 1º ZABBIX MEETUP DO INTERIOR-SP
UserParameter vs Zabbix Sender - 1º ZABBIX MEETUP DO INTERIOR-SP
André Déo
 
OTRS紹介資料
OTRS紹介資料OTRS紹介資料
OTRS紹介資料
IO Architect Inc.
 
Trends_of_MLOps_tech_in_business
Trends_of_MLOps_tech_in_businessTrends_of_MLOps_tech_in_business
Trends_of_MLOps_tech_in_business
SANG WON PARK
 
ネットワーク自動化の課題 - グラフデータベースによる解決
ネットワーク自動化の課題 - グラフデータベースによる解決ネットワーク自動化の課題 - グラフデータベースによる解決
ネットワーク自動化の課題 - グラフデータベースによる解決
ApstraJapan
 
[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)
NAVER D2
 

What's hot (20)

Djangoによるスマホアプリバックエンドの実装
Djangoによるスマホアプリバックエンドの実装Djangoによるスマホアプリバックエンドの実装
Djangoによるスマホアプリバックエンドの実装
 
[2019] 바르게, 빠르게! Reactive를 품은 Spring Kafka
[2019] 바르게, 빠르게! Reactive를 품은 Spring Kafka[2019] 바르게, 빠르게! Reactive를 품은 Spring Kafka
[2019] 바르게, 빠르게! Reactive를 품은 Spring Kafka
 
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
 
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
 
LMAX Architecture
LMAX ArchitectureLMAX Architecture
LMAX Architecture
 
Deep Dive into Apache Kafka
Deep Dive into Apache KafkaDeep Dive into Apache Kafka
Deep Dive into Apache Kafka
 
Keycloak拡張入門
Keycloak拡張入門Keycloak拡張入門
Keycloak拡張入門
 
EtherCATやPROFINETを OPC UAで接続してみた
EtherCATやPROFINETを OPC UAで接続してみたEtherCATやPROFINETを OPC UAで接続してみた
EtherCATやPROFINETを OPC UAで接続してみた
 
より高品質なメディアサービスを目指す ABEMA の技術進化
より高品質なメディアサービスを目指す ABEMA の技術進化より高品質なメディアサービスを目指す ABEMA の技術進化
より高品質なメディアサービスを目指す ABEMA の技術進化
 
KeycloakでAPI認可に入門する
KeycloakでAPI認可に入門するKeycloakでAPI認可に入門する
KeycloakでAPI認可に入門する
 
Flask With Server-Sent Event
Flask With Server-Sent EventFlask With Server-Sent Event
Flask With Server-Sent Event
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
在學校開機場把自己的學費賺回來!.pptx
在學校開機場把自己的學費賺回來!.pptx在學校開機場把自己的學費賺回來!.pptx
在學校開機場把自己的學費賺回來!.pptx
 
Php Performance On Windows
Php Performance On WindowsPhp Performance On Windows
Php Performance On Windows
 
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GCHadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
 
UserParameter vs Zabbix Sender - 1º ZABBIX MEETUP DO INTERIOR-SP
UserParameter vs Zabbix Sender - 1º ZABBIX MEETUP DO INTERIOR-SPUserParameter vs Zabbix Sender - 1º ZABBIX MEETUP DO INTERIOR-SP
UserParameter vs Zabbix Sender - 1º ZABBIX MEETUP DO INTERIOR-SP
 
OTRS紹介資料
OTRS紹介資料OTRS紹介資料
OTRS紹介資料
 
Trends_of_MLOps_tech_in_business
Trends_of_MLOps_tech_in_businessTrends_of_MLOps_tech_in_business
Trends_of_MLOps_tech_in_business
 
ネットワーク自動化の課題 - グラフデータベースによる解決
ネットワーク自動化の課題 - グラフデータベースによる解決ネットワーク自動化の課題 - グラフデータベースによる解決
ネットワーク自動化の課題 - グラフデータベースによる解決
 
[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)
 

Similar to Akka streams vs spark structured streaming

Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Databricks
 
Using spark data frame for sql
Using spark data frame for sqlUsing spark data frame for sql
Using spark data frame for sql
DaeMyung Kang
 
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Databricks
 
Java and XML Schema
Java and XML SchemaJava and XML Schema
Java and XML Schema
Raji Ghawi
 
Scala in Places API
Scala in Places APIScala in Places API
Scala in Places API
Łukasz Bałamut
 
Making Structured Streaming Ready for Production
Making Structured Streaming Ready for ProductionMaking Structured Streaming Ready for Production
Making Structured Streaming Ready for Production
Databricks
 
import java-util--- import java-io--- class Vertex { -- Constructo.docx
import java-util--- import java-io---   class Vertex {   -- Constructo.docximport java-util--- import java-io---   class Vertex {   -- Constructo.docx
import java-util--- import java-io--- class Vertex { -- Constructo.docx
Blake0FxCampbelld
 
05 Geographic scripting in uDig - halfway between user and developer
05 Geographic scripting in uDig - halfway between user and developer05 Geographic scripting in uDig - halfway between user and developer
05 Geographic scripting in uDig - halfway between user and developer
Andrea Antonello
 
AWS CloudFormation Session
AWS CloudFormation SessionAWS CloudFormation Session
AWS CloudFormation Session
Kamal Maiti
 
[2019-07] GraphQL in depth (serverside)
[2019-07] GraphQL in depth (serverside)[2019-07] GraphQL in depth (serverside)
[2019-07] GraphQL in depth (serverside)
croquiscom
 
Stream analysis with kafka native way and considerations about monitoring as ...
Stream analysis with kafka native way and considerations about monitoring as ...Stream analysis with kafka native way and considerations about monitoring as ...
Stream analysis with kafka native way and considerations about monitoring as ...
Andrew Yongjoon Kong
 
Jackson beyond JSON: XML, CSV
Jackson beyond JSON: XML, CSVJackson beyond JSON: XML, CSV
Jackson beyond JSON: XML, CSV
Tatu Saloranta
 
3 database-jdbc(1)
3 database-jdbc(1)3 database-jdbc(1)
3 database-jdbc(1)
hameedkhan2017
 
So various polymorphism in Scala
So various polymorphism in ScalaSo various polymorphism in Scala
So various polymorphism in Scala
b0ris_1
 
What is new in Java 8
What is new in Java 8What is new in Java 8
What is new in Java 8
Sandeep Kr. Singh
 
Java7 New Features and Code Examples
Java7 New Features and Code ExamplesJava7 New Features and Code Examples
Java7 New Features and Code Examples
Naresh Chintalcheru
 
Rule Your Geometry with the Terraformer Toolkit
Rule Your Geometry with the Terraformer ToolkitRule Your Geometry with the Terraformer Toolkit
Rule Your Geometry with the Terraformer ToolkitAaron Parecki
 
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
Amazon Web Services
 

Similar to Akka streams vs spark structured streaming (20)

Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
 
Using spark data frame for sql
Using spark data frame for sqlUsing spark data frame for sql
Using spark data frame for sql
 
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
 
Java and XML Schema
Java and XML SchemaJava and XML Schema
Java and XML Schema
 
Scala in Places API
Scala in Places APIScala in Places API
Scala in Places API
 
Making Structured Streaming Ready for Production
Making Structured Streaming Ready for ProductionMaking Structured Streaming Ready for Production
Making Structured Streaming Ready for Production
 
import java-util--- import java-io--- class Vertex { -- Constructo.docx
import java-util--- import java-io---   class Vertex {   -- Constructo.docximport java-util--- import java-io---   class Vertex {   -- Constructo.docx
import java-util--- import java-io--- class Vertex { -- Constructo.docx
 
05 Geographic scripting in uDig - halfway between user and developer
05 Geographic scripting in uDig - halfway between user and developer05 Geographic scripting in uDig - halfway between user and developer
05 Geographic scripting in uDig - halfway between user and developer
 
Aimaf
AimafAimaf
Aimaf
 
AWS CloudFormation Session
AWS CloudFormation SessionAWS CloudFormation Session
AWS CloudFormation Session
 
[2019-07] GraphQL in depth (serverside)
[2019-07] GraphQL in depth (serverside)[2019-07] GraphQL in depth (serverside)
[2019-07] GraphQL in depth (serverside)
 
Stream analysis with kafka native way and considerations about monitoring as ...
Stream analysis with kafka native way and considerations about monitoring as ...Stream analysis with kafka native way and considerations about monitoring as ...
Stream analysis with kafka native way and considerations about monitoring as ...
 
Introduction to Scala
Introduction to ScalaIntroduction to Scala
Introduction to Scala
 
Jackson beyond JSON: XML, CSV
Jackson beyond JSON: XML, CSVJackson beyond JSON: XML, CSV
Jackson beyond JSON: XML, CSV
 
3 database-jdbc(1)
3 database-jdbc(1)3 database-jdbc(1)
3 database-jdbc(1)
 
So various polymorphism in Scala
So various polymorphism in ScalaSo various polymorphism in Scala
So various polymorphism in Scala
 
What is new in Java 8
What is new in Java 8What is new in Java 8
What is new in Java 8
 
Java7 New Features and Code Examples
Java7 New Features and Code ExamplesJava7 New Features and Code Examples
Java7 New Features and Code Examples
 
Rule Your Geometry with the Terraformer Toolkit
Rule Your Geometry with the Terraformer ToolkitRule Your Geometry with the Terraformer Toolkit
Rule Your Geometry with the Terraformer Toolkit
 
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
 

Recently uploaded

一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
2023240532
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 

Recently uploaded (20)

一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 

Akka streams vs spark structured streaming

  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26. val inputStream = s.readStream .format("kinesis") .option("streamName", "stream-name") // Kinesis Stream Name .option("region", "ap-northeast-1") // Kinesis Region .option("initialPosition", "TRIM_HORIZON") // Kinesis initial position .load() case class SampleSchema(column1: String, column2: Int) val testSchema = ScalaReflection.schemaFor[SampleSchema].dataType.asInstanceOf[StructType] val structuredStream = inputStream .select(from_json(col("data").cast("string"), testSchema).as("data")) // Parse JSON and convert to testSchema .select("data.*") // select JSON columns
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53.
  • 54.
  • 55.
  • 56.
  • 57.
  • 58. val kclSource = KCLSource("scalamatsuri") val producer = Flow[UserLog] .map(_.toJson.compactPrint) .map(ByteString.fromString(_)) .map(KinesisStreamObject(UUID.randomUUID().toString, _)) .toMat(KinesisProducer("scalamatsuri2", KinesisProducerSetting.fromShardCount(1)))(Keep.right) val extract = Flow.fromFunction[ByteString, PurchaseLog](_.utf8String.parseJson.convertTo[PurchaseLog]) val convert = Flow[PurchaseLog].mapConcat { log => log.items.map { item => UserLog( log.userId, // properties... ) } } val result = kclSource .via(extract) .via(convert) .toMat(producer)(Keep.right) .run()
  • 59.
  • 60. val stream = inputStream .select(from_json(col("data").cast("string"), purchaseLogSchema).as("data")) .select("data.*") .select( $"user_id", $"time", expr("explode(items)").as("item") ) .select( $"user_id", $"time", lit("PURCHASE").as("log_type"), $"item.item_id".as("item_id"), $"item.price".as("price"), $"item.quantity".as("quantity"), unix_timestamp().as("created_time") ) .as[UserLog] // case class UserLog( stream .writeStream .foreach(sink) .outputMode("update") .start()
  • 61.
  • 62.
  • 63. val http = Http() val separator = ByteString.fromString(Properties.lineSeparator) val separatorBytes = separator.length val MAX_FILE_SIZE = 10 * 1024 * 1024 val MAX_BYTES_PER_HOUR = 100 * 1024 * 1024 val kclSource = KCLSource("scalamatsuri") val buffering = Flow[ByteString] .batchWeighted[ByteString](MAX_FILE_SIZE, _.length, seed => seed)(_ ++ separator ++ _) .throttle(MAX_BYTES_PER_HOUR, Duration(1, HOURS), _.length) val send: Flow[ByteString, ByteString] = ??? // val logging = Sink.foreach[ByteString] { body => logger.info(body.utf8String) } val request = kclSource .via(buffering) .via(send) .toMat(logging)(Keep.right) .run()
  • 64. val inputStream = spark.readStream .format("kinesis") .option("streamName", scalamatsuriStraem) .option("region", "ap-northeast-1") .option("initialPosition", "TRIM_HORIZON") .option("awsAccessKey", awsAccessKeyId) .option("awsSecretKey", awsSecretKey) .option("maxRecordsPerFetch", 100000) .option("fetchBufferSize", 100000) .option("maxFetchDuration", 1.0) .load()
  • 65.
  • 66.
  • 67.
  • 68.
  • 69. val strteamWithWM = stream1.as("in1") .withColumn("dt", to_timestamp(from_unixtime(col("time")))) .withWatermark("dt", "3 hours") val stream2WithWM = stream2.as("in2") .withColumn("dt", to_timestamp(from_unixtime(col("time")))) .withWatermark("dt", "1 hours") strteamWithWM.join( stream2WithWM, expr(""" in1.user_id = in2.user_id AND in1.time < in2.time AND in1.time >= in2.time + interval 3 hour """), "leftOuter" )