Stream processing
Cloud Pub/Sub
with Alpakka
Fringe81 in-house Scala study group, 2018.07.19
Alpakka
AkkaStream Connector for integration!
AMQP, MQTT, SQS, SNS, Kinesis, Cloud PubSub, Azure
Event Hub, Firebase cloud messaging, S3, Kafka, MongoDB,
HBase, Slick, HTTP, TCP, etc,...
A quick Akka Streams recap
Three main building blocks:
Source, Flow, Sink
Combine them
(1 to 3)
.map{ i => println(s"A: $i"); i }
.map{ i => println(s"B: $i"); i }
.map{ i => println(s"C: $i"); i }
A: 1
A: 2
A: 3
B: 1
B: 2
B: 3
C: 1
C: 2
C: 3
A plain Scala collection
Source(1 to 3)
.map{ i => println(s"A: $i"); i }
.map{ i => println(s"B: $i"); i }
.map{ i => println(s"C: $i"); i }
.runWith(Sink.ignore)
A: 1
A: 2
B: 1
A: 3
B: 2
C: 1
B: 3
C: 2
C: 3
Akka Stream
There's more, such as backpressure,
but we'll skip that this time
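The Source snippet above omits the setup needed to actually run it; a minimal self-contained version might look like the following (a sketch, assuming akka-stream 2.5.x on the classpath):

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object StreamRecap extends App {
  implicit val system: ActorSystem = ActorSystem("recap")
  implicit val mat: ActorMaterializer = ActorMaterializer()

  // Each element flows through all three stages one by one,
  // which is why the A/B/C prints interleave instead of
  // completing stage by stage like a strict collection
  val done = Source(1 to 3)
    .map { i => println(s"A: $i"); i }
    .map { i => println(s"B: $i"); i }
    .map { i => println(s"C: $i"); i }
    .runWith(Sink.ignore)

  Await.result(done, 5.seconds)
  system.terminate()
}
```

The exact interleaving order may vary between runs, so no particular output ordering should be relied on.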
Alpakka
AkkaStream Connector for integration!
AMQP, MQTT, SQS, SNS, Kinesis, Cloud PubSub, Azure
Event Hub, Firebase cloud messaging, S3, Kafka, MongoDB,
HBase, Slick, HTTP, TCP, etc,...
Before we get to Alpakka,
let's first look at the Java API
for Cloud Pub/Sub provided by Google
~ Publish ~
val msg = ByteString.copyFromUtf8(jsonStr)
val pubSubMsg =
PubsubMessage.newBuilder.setData(msg).build()
val resultF: ApiFuture[String] =
publisher.publish(pubSubMsg)
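`publisher.publish` returns an `ApiFuture`, which doesn't compose directly with Scala code; one common bridge is a callback into a `Promise` (a sketch, assuming Google's api-core library; `toScalaFuture` is a name introduced here, not part of the API):

```scala
import com.google.api.core.{ApiFuture, ApiFutureCallback, ApiFutures}
import scala.concurrent.{Future, Promise}

// Convert a Google ApiFuture into a Scala Future via a Promise
def toScalaFuture[A](apiFuture: ApiFuture[A]): Future[A] = {
  val p = Promise[A]()
  ApiFutures.addCallback(apiFuture, new ApiFutureCallback[A] {
    override def onSuccess(result: A): Unit = p.success(result)
    override def onFailure(t: Throwable): Unit = p.failure(t)
  })
  p.future
}
```

Note that newer api-core versions require an `Executor` as a third argument to `addCallback`.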
~ Subscribe ~
Implement this interface
public interface MessageReceiver {
void receiveMessage(
final PubsubMessage message,
final AckReplyConsumer consumer
);
}
~ Subscribe ~
Process the message, then finish with consumer.ack()
public interface MessageReceiver {
void receiveMessage(
final PubsubMessage message,
final AckReplyConsumer consumer
);
}
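A minimal receiver along those lines might look like this (a sketch, assuming the `com.google.cloud.pubsub.v1` client; "my-project" and "my-sub" are placeholder names):

```scala
import com.google.cloud.pubsub.v1.{AckReplyConsumer, MessageReceiver, Subscriber}
import com.google.pubsub.v1.{ProjectSubscriptionName, PubsubMessage}

val receiver = new MessageReceiver {
  override def receiveMessage(message: PubsubMessage, consumer: AckReplyConsumer): Unit = {
    // application-specific handling of a single message
    println(message.getData.toStringUtf8)
    consumer.ack() // tell Pub/Sub the message is done; otherwise it is redelivered
  }
}

val subscriber = Subscriber
  .newBuilder(ProjectSubscriptionName.of("my-project", "my-sub"), receiver)
  .build()
subscriber.startAsync().awaitRunning()
```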
public interface MessageReceiver {
void receiveMessage(
final PubsubMessage message,
final AckReplyConsumer consumer
);
}
~ Subscribe ~
A processing model centered on how to handle a single message
from https://github.com/openmessaging/specification/blob/master/usecase.md
Stream processing with Alpakka, by contrast
from https://github.com/openmessaging/specification/blob/master/usecase.md
Now, let's implement it
~ Subscriber ~
val subscriptionSource: Source[ReceivedMessage, NotUsed] =
GooglePubSub.subscribe(projectId, apiKey, email, privateKey, subscription)
val ackSink: Sink[AcknowledgeRequest, Future[Done]] =
GooglePubSub.acknowledge(projectId, apiKey, email, privateKey, subscription)
What Alpakka provides for subscribers is a Source and a Sink
The Source emits up to 1,000 messages at a time
The Sink sends acks back for whatever is fed into it
val subscriptionSource: Source[ReceivedMessage, NotUsed] =
GooglePubSub.subscribe(projectId, apiKey, email, privateKey, subscription)
val ackSink: Sink[AcknowledgeRequest, Future[Done]] =
GooglePubSub.acknowledge(projectId, apiKey, email, privateKey, subscription)
val doneF: Future[Done] =
subscriptionSource
.grouped(3)
.map(msgs => AcknowledgeRequest(msgs.map(_.ackId)))
.runWith(ackSink)
The Flow part is where application-specific processing goes.
Here we simply batch messages into groups of three and just send acks back
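A real application would do per-message work before acking; one way to slot that in, reusing `subscriptionSource` and `ackSink` from the slide above (`handleMessage` is a hypothetical application function, not part of the connector):

```scala
import akka.Done
import scala.concurrent.{ExecutionContext, Future}

implicit val ec: ExecutionContext = materializer.executionContext

// Placeholder for application-specific work on one message
def handleMessage(msg: ReceivedMessage): Future[Unit] = ???

val done: Future[Done] =
  subscriptionSource
    .mapAsync(parallelism = 4) { received =>
      // ack only after processing has succeeded, so failed
      // messages are redelivered by Pub/Sub
      handleMessage(received).map(_ => received.ackId)
    }
    .grouped(3)
    .map(ackIds => AcknowledgeRequest(ackIds))
    .runWith(ackSink)
```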
~ Publisher ~
val publishFlow: Flow[PublishRequest, Seq[String], NotUsed] =
GooglePubSub.publish(projectId, apiKey, clientEmail, privateKey, topic)
What's provided for publishers is a Flow
Given a PublishRequest as input, it performs the publish,
and the server-assigned message IDs are the output
val publishFlow: Flow[PublishRequest, Seq[String], NotUsed] =
GooglePubSub.publish(projectId, apiKey, clientEmail, privateKey, topic)
val source: Source[PublishRequest, NotUsed] = {
val msgs = (1 to 10).map { i =>
PubSubMessage(
messageId = i.toString,
data = new String(Base64.getEncoder.encode("Hello".getBytes))
)}
Source.single(PublishRequest(msgs))
}
val done: Future[Done] =
source.via(publishFlow).runWith(Sink.ignore)
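Sink.ignore discards the assigned IDs; to actually capture them, the same pipeline could be drained into Sink.seq instead (a sketch reusing `source` and `publishFlow` from the slide above):

```scala
import akka.stream.scaladsl.Sink
import scala.concurrent.Future

import materializer.executionContext

// Each stream element is one Seq[String] of assigned IDs
// (one per PublishRequest), so flatten into a single list
val publishedIds: Future[Seq[String]] =
  source
    .via(publishFlow)
    .runWith(Sink.seq)
    .map(_.flatten)
```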
By the way, the Java API
can only fetch messages one at a time,
so how does Alpakka pull them
from Cloud Pub/Sub in batches?
def pull(project: String, subscription: String, maybeAccessToken: Option[String], apiKey: String)(
implicit as: ActorSystem,
materializer: Materializer
): Future[PullResponse] = {
import materializer.executionContext
val uri: Uri = s"$PubSubGoogleApisHost/v1/projects/$project/subscriptions/$subscription:pull"
val request = HttpApi.PullRequest(returnImmediately = true, maxMessages = 1000)
Inside the alpakka Pub/Sub connector
I had assumed it used the Java API,
but it just calls the Cloud Pub/Sub web API
The End
