Kafka JDBC Connect Guide (Postgres Sink)

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Loading in …3
×

Check these out next

1 of 27 Ad

Kafka JDBC Connect Guide(Postgres Sink).pptx

Download to read offline

Kafka JDBC Connect installation guide (with PostgreSQL):
- stand-alone mode
- distributed mode
- schema registry
- Java producer example


1. Stand-Alone Mode
2. Standalone
   • Create a table in the PostgreSQL database:
     CREATE TABLE employee (
       idx  INT,
       name CHARACTER VARYING(300)
     );
3. Topic
   • Create a topic:
     kafka-topics.sh --create --topic employee --bootstrap-server master:9092,slave1:9092,slave2:9092 --replication-factor 1 --partitions 1
   • Check the created topic:
     kafka-topics.sh --bootstrap-server localhost:9092 --list
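   • [Note] The partition and replica layout of the new topic can also be confirmed with the describe option of the same script (an optional check, not in the original deck):
     kafka-topics.sh --describe --topic employee --bootstrap-server master:9092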
4. Run the Schema Registry
   • From the Confluent home directory:
     sh bin/schema-registry-start etc/schema-registry/schema-registry.properties
   • Open the port:
     firewall-cmd --permanent --zone=public --add-port=8081/tcp
     firewall-cmd --reload
5. Download the connector
   • https://www.confluent.io/hub/confluentinc/kafka-connect-jdbc
6. Unzip the connector
   • Unzip the connector archive under {confluent home directory}/share/
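   • [Note] If the confluent-hub client is installed, the download and unzip steps can instead be done in one command (a hedged alternative, not in the original deck; version 10.6.0 chosen to match the plugin.path used below):
     confluent-hub install confluentinc/kafka-connect-jdbc:10.6.0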
7. Configuration (1/2)
   • {confluent home directory}/etc/kafka/connect-standalone.properties
     bootstrap.servers=localhost:9092
     key.converter=io.confluent.connect.avro.AvroConverter
     key.converter.schema.registry.url=http://localhost:8081
     value.converter=io.confluent.connect.avro.AvroConverter
     value.converter.schema.registry.url=http://localhost:8081
     key.converter.schemas.enable=true
     value.converter.schemas.enable=true
     offset.storage.file.filename=/tmp/connect.offsets
     offset.flush.interval.ms=10000
     # Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
     plugin.path=/root/confluent-7.3.0/share/confluentinc-kafka-connect-jdbc-10.6.0
8. Configuration (2/2)
   • {confluent home directory}/etc/kafka/sink-postgres.properties
     name=employee-sink
     connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
     tasks.max=1
     topics={comma-separated topic list, e.g. topicA,topicB,...}
     connection.url=jdbc:postgresql://{db ip}:{db port}/{db name}
     connection.user={db id}
     connection.password={db pwd}
     insert.mode=insert
     auto.create=true
     table.name.format={table name}
     pk.mode=none
     pk.fields=none
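   • [Note] A filled-in sink-postgres.properties for the employee walkthrough might look as follows; this is a sketch only, and the host, port, database name, and credentials are illustrative placeholders, not values from the deck:
     name=employee-sink
     connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
     tasks.max=1
     topics=employee
     # illustrative connection details - replace with your own database host, port, name, and credentials
     connection.url=jdbc:postgresql://127.0.0.1:5432/postgres
     connection.user=postgres
     connection.password=postgres
     insert.mode=insert
     auto.create=true
     table.name.format=employee
     pk.mode=none
     pk.fields=none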
9. Execution
   • sh bin/connect-standalone etc/kafka/connect-standalone.properties etc/kafka/sink-postgres.properties
     [2022-12-08 07:42:54,056] INFO [employee-sink|task-0] [Consumer clientId=connector-consumer-employee-sink-0, groupId=connect-employee-sink] Resetting offset for partition employee-0 to position FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[slave2.kopo:9092 (id: 3 rack: null)], epoch=0}}. (org.apache.kafka.clients.consumer.internals.SubscriptionState:399)
     [2022-12-08 07:47:56,035] INFO [employee-sink|task-0] Attempting to open connection #1 to PostgreSql (io.confluent.connect.jdbc.util.CachedConnectionProvider:79)
     [2022-12-08 07:47:56,456] INFO [employee-sink|task-0] Maximum table name length for database is 63 bytes (io.confluent.connect.jdbc.dialect.PostgreSqlDatabaseDialect:130)
     [2022-12-08 07:47:56,456] INFO [employee-sink|task-0] JdbcDbWriter Connected (io.confluent.connect.jdbc.sink.JdbcDbWriter:56)
     [2022-12-08 07:47:56,531] INFO [employee-sink|task-0] Checking PostgreSql dialect for existence of TABLE "employee" (io.confluent.connect.jdbc.dialect.GenericDatabaseDialect:586)
     [2022-12-08 07:47:56,548] INFO [employee-sink|task-0] Using PostgreSql dialect TABLE "employee" present (io.confluent.connect.jdbc.dialect.GenericDatabaseDialect:594)
     [2022-12-08 07:47:56,578] INFO [employee-sink|task-0] Checking PostgreSql dialect for type of TABLE "employee" (io.confluent.connect.jdbc.dialect.GenericDatabaseDialect:880)
     [2022-12-08 07:47:56,589] INFO [employee-sink|task-0] Setting metadata for table "employee" to Table{name='"employee"', type=TABLE columns=[Column{'idx', isPrimaryKey=false, allowsNull=true, sqlType=int4}, Column{'name', isPrimaryKey=false, allowsNull=true, sqlType=varchar}]} (io.confluent.connect.jdbc.util.TableDefinitions:64)
     [2022-12-08 07:51:53,759] INFO [employee-sink|task-0] [Consumer clientId=connector-consumer-employee-sink-0, groupId=connect-employee-sink] Node -1 disconnected. (org.apache.kafka.clients.NetworkClient:937)
10. Test
    • Consumer shell ({confluent home dir}/bin):
      ./kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic employee
    • Producer shell ({confluent home dir}/bin):
      ./kafka-avro-console-producer --broker-list localhost:9092 --topic employee --property value.schema='{"type":"record","name":"kafka_employee","fields":[{"name":"idx","type":"int"},{"name":"name","type":"string"}]}'
      {"idx":1,"name":"hwang"}
      {"idx":2,"name":"kim"}
    • Produced records are written to the employee table in the database by the sink connector.
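    • [Note] To confirm that the records actually reached PostgreSQL, query the table directly (a minimal check, assuming the psql client is available; the {db ...} placeholders are the same as in the sink configuration):
      psql -h {db ip} -p {db port} -U {db id} -d {db name} -c "SELECT * FROM employee;"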
11. Distributed Mode
12. Configuration
    • etc/kafka/connect-distributed.properties
      bootstrap.servers=master:9092,slave1:9091,slave2:9092
      group.id=connect-cluster
      key.converter=io.confluent.connect.avro.AvroConverter
      key.converter.schema.registry.url=http://localhost:8081
      value.converter=io.confluent.connect.avro.AvroConverter
      value.converter.schema.registry.url=http://localhost:8081
      key.converter.schemas.enable=false
      value.converter.schemas.enable=false
      offset.storage.topic=connect-offsets
      offset.storage.replication.factor=1
      config.storage.topic=connect-configs
      config.storage.replication.factor=1
      status.storage.topic=connect-status
      status.storage.replication.factor=1
      offset.flush.interval.ms=10000
      plugin.path=/root/confluent-7.3.0/share/confluentinc-kafka-connect-jdbc-10.6.0
13. Execution
    • sh bin/connect-distributed etc/kafka/connect-distributed.properties
      [2022-12-08 08:09:53,943] INFO REST resources initialized; server is started and ready to handle requests (org.apache.kafka.connect.runtime.rest.RestServer:312)
      [2022-12-08 08:09:53,944] INFO Kafka Connect started (org.apache.kafka.connect.runtime.Connect:56)
      [2022-12-08 08:09:54,419] INFO [sink-jdbc-postgre|task-0] Attempting to open connection #1 to PostgreSql (io.confluent.connect.jdbc.util.CachedConnectionProvider:79)
      [2022-12-08 08:09:55,020] INFO [sink-jdbc-postgre|task-0] Maximum table name length for database is 63 bytes (io.confluent.connect.jdbc.dialect.PostgreSqlDatabaseDialect:130)
      [2022-12-08 08:09:55,020] INFO [sink-jdbc-postgre|task-0] JdbcDbWriter Connected (io.confluent.connect.jdbc.sink.JdbcDbWriter:56)
      [2022-12-08 08:09:55,142] INFO [sink-jdbc-postgre|task-0] Checking PostgreSql dialect for existence of TABLE "employee" (io.confluent.connect.jdbc.dialect.GenericDatabaseDialect:586)
      [2022-12-08 08:09:55,168] INFO [sink-jdbc-postgre|task-0] Using PostgreSql dialect TABLE "employee" present (io.confluent.connect.jdbc.dialect.GenericDatabaseDialect:594)
      [2022-12-08 08:09:55,201] INFO [sink-jdbc-postgre|task-0] Checking PostgreSql dialect for type of TABLE "employee" (io.confluent.connect.jdbc.dialect.GenericDatabaseDialect:880)
      [2022-12-08 08:09:55,216] INFO [sink-jdbc-postgre|task-0] Setting metadata for table "employee" to Table{name='"employee"', type=TABLE columns=[Column{'idx', isPrimaryKey=false, allowsNull=true, sqlType=int4}, Column{'name', isPrimaryKey=false, allowsNull=true, sqlType=varchar}]} (io.confluent.connect.jdbc.util.TableDefinitions:64)
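    • [Note] Once "Kafka Connect started" appears, the worker's REST API (port 8083 by default) should be reachable; a quick sanity check that returns the worker version:
      curl -s localhost:8083/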
14. Check the registered connector plugins
    • curl --location --request GET 'localhost:8083/connector-plugins'
      [{"class":"io.confluent.connect.jdbc.JdbcSinkConnector","type":"sink","version":"10.6.0"},
       {"class":"io.confluent.connect.jdbc.JdbcSourceConnector","type":"source","version":"10.6.0"},
       {"class":"org.apache.kafka.connect.mirror.MirrorCheckpointConnector","type":"source","version":"7.3.0-ccs"},
       {"class":"org.apache.kafka.connect.mirror.MirrorHeartbeatConnector","type":"source","version":"7.3.0-ccs"},
       {"class":"org.apache.kafka.connect.mirror.MirrorSourceConnector","type":"source","version":"7.3.0-ccs"}]
15. Register a JDBC Sink
    curl --request POST 'localhost:8083/connectors' --header 'Content-Type: application/json' --data '{
      "name": "sink-jdbc-postgre",
      "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "tasks.max": "1",
        "topics": "{topic id}",
        "connection.url": "jdbc:postgresql://{db ip}:{db port}/{db name}",
        "connection.user": "{db id}",
        "connection.password": "{db pwd}",
        "table.name.format": "{table name}",
        "insert.mode": "insert",
        "auto.create": "true",
        "pk.mode": "none",
        "pk.fields": "none"
      }
    }'
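    • [Note] After registration, the connector and its task state can be verified through the standard Connect REST status endpoint:
      curl -s localhost:8083/connectors/sink-jdbc-postgre/status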
16. Query a connector
    • curl -s localhost:8083/connectors/sink-jdbc-postgre
      {"name":"sink-jdbc-postgre","config":{"connector.class":"io.confluent.connect.jdbc.JdbcSinkConnector","table.name.format":"employee","connection.password":"{db pwd}","tasks.max":"1","topics":"employee","connection.user":"{db id}","name":"sink-jdbc-postgre","auto.create":"true","connection.url":"jdbc:postgresql://{db ip}:{db port}/{db name}","insert.mode":"insert","pk.mode":"none","pk.fields":"none"},"tasks":[{"connector":"sink-jdbc-postgre","task":0}],"type":"sink"}
    • [FYI] Delete a connector:
      curl -X DELETE "http://localhost:8083/connectors/sink-jdbc-postgre"
17. Check the Schema Registry
    • curl localhost:8081/subjects
    • curl localhost:8081/subjects/employee-value/versions
    • curl localhost:8081/subjects/employee-value/versions/1
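    • [Note] The newest schema version can also be fetched without knowing its number, using the Schema Registry's "latest" alias:
      curl localhost:8081/subjects/employee-value/versions/latest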
18. Test (same as stand-alone mode)
    • Consumer shell ({confluent home dir}/bin):
      ./kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic employee
    • Producer shell ({confluent home dir}/bin):
      ./kafka-avro-console-producer --broker-list localhost:9092 --topic employee --property value.schema='{"type":"record","name":"kafka_employee","fields":[{"name":"idx","type":"int"},{"name":"name","type":"string"}]}'
      {"idx":1,"name":"hwang"}
      {"idx":2,"name":"kim"}
    • Produced records are written to the employee table in the database by the sink connector.
19. Java Producer
20. Step 1 – Maven installation
    # wget https://downloads.apache.org/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz -P /tmp
    # tar -xzvf /tmp/apache-maven-3.6.3-bin.tar.gz -C /opt
    # ln -s /opt/apache-maven-3.6.3 /opt/maven
    # vi /etc/profile.d/maven.sh
      export M2_HOME=/opt/maven
      export MAVEN_HOME=/opt/maven
      export PATH=${M2_HOME}/bin:${PATH}
    # source /etc/profile.d/maven.sh
    # mvn --version
21. Step 2 – pom.xml
    mkdir ex_connect
    cd ex_connect
    vi pom.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <groupId>com.kopo.kafka</groupId>
      <artifactId>kafka-example</artifactId>
      <version>1.0</version>
      <repositories>
        <repository>
          <id>confluent</id>
          <url>http://packages.confluent.io/maven/</url>
        </repository>
      </repositories>
      <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <confluent.version>7.3.0</confluent.version>
      </properties>
      <dependencies>
        <dependency>
          <groupId>org.apache.kafka</groupId>
          <artifactId>kafka_2.13</artifactId>
          <version>3.3.1</version>
        </dependency>
        <dependency>
          <groupId>io.confluent</groupId>
          <artifactId>kafka-avro-serializer</artifactId>
          <version>${confluent.version}</version>
        </dependency>
        <dependency>
          <groupId>io.confluent</groupId>
          <artifactId>common-config</artifactId>
          <version>${confluent.version}</version>
        </dependency>
        <dependency>
          <groupId>org.apache.avro</groupId>
          <artifactId>avro</artifactId>
          <version>1.11.1</version>
        </dependency>
        <dependency>
          <groupId>com.google.guava</groupId>
          <artifactId>guava</artifactId>
          <version>31.0-jre</version>
        </dependency>
        <dependency>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-api</artifactId>
          <version>1.7.5</version>
        </dependency>
        <dependency>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-simple</artifactId>
          <version>1.6.4</version>
        </dependency>
      </dependencies>
    </project>
22. Step 3 – Implementation
    mkdir -p src/main/java/com/kopo/kafka
    vi src/main/java/com/kopo/kafka/ConnectProducer.java

    package com.kopo.kafka;

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;
    import io.confluent.kafka.serializers.KafkaAvroSerializer;

    import java.util.Properties;

    public class ConnectProducer {

        private final static String TOPIC_NAME = "employee";
        private final static String BOOTSTRAP_SERVERS = "localhost:9092";

        public static void main(String[] args) {
            Properties configs = new Properties();
            configs.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS);
            //configs.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            configs.setProperty("key.serializer", KafkaAvroSerializer.class.getName());
            configs.setProperty("value.serializer", KafkaAvroSerializer.class.getName());
            // URL of the running Schema Registry server
            configs.setProperty("schema.registry.url", "http://192.168.56.30:8081");

            // Avro schema matching the employee topic and table
            String schema = "{"
                    // + " \"namespace\": \"myrecord\","
                    + " \"name\": \"kafka_employee\","
                    + " \"type\": \"record\","
                    + " \"fields\": ["
                    + "   {\"name\": \"idx\", \"type\": \"int\"},"
                    + "   {\"name\": \"name\", \"type\": \"string\"}"
                    + " ]"
                    + "}";
            Schema.Parser parser = new Schema.Parser();
            Schema avroSchema1 = parser.parse(schema);

            // generate an Avro generic record
            GenericRecord avroRecord = new GenericData.Record(avroSchema1);
            avroRecord.put("idx", 100);
            avroRecord.put("name", "test-name");

            KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(configs);
            ProducerRecord<String, GenericRecord> record = new ProducerRecord<>(TOPIC_NAME, avroRecord);
            producer.send(record);
            producer.flush();
            producer.close();
        }
    }
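    [Note] For troubleshooting it can help to confirm delivery from the producer side. A small variation on the send() call above, using the producer's standard callback overload (a sketch; the rest of the class stays unchanged):

    // send with a callback so the partition/offset (or the exception) is printed
    producer.send(record, (metadata, exception) -> {
        if (exception != null) {
            exception.printStackTrace();
        } else {
            System.out.printf("sent to %s-%d@%d%n",
                    metadata.topic(), metadata.partition(), metadata.offset());
        }
    });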
23. Step 4 – Maven install
    mvn install
24. Step 5 – Execute the jar
    java -cp target/kafka-example-1.0.jar:/root/.m2/repository/org/apache/kafka/kafka_2.13/3.3.1/kafka_2.13-3.3.1.jar:/root/.m2/repository/org/apache/kafka/kafka-clients/3.3.1/kafka-clients-3.3.1.jar:/root/.m2/repository/org/apache/avro/avro/1.11.1/avro-1.11.1.jar:/root/.m2/repository/io/confluent/kafka-avro-serializer/7.3.0/kafka-avro-serializer-7.3.0.jar:/root/.m2/repository/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar:/root/.m2/repository/org/slf4j/slf4j-simple/1.6.4/slf4j-simple-1.6.4.jar:/root/.m2/repository/io/confluent/kafka-schema-serializer/7.3.0/kafka-schema-serializer-7.3.0.jar:/root/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.13.3/jackson-databind-2.13.3.jar:/root/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.13.3/jackson-core-2.13.3.jar:/root/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.13.3/jackson-annotations-2.13.3.jar:/root/.m2/repository/io/confluent/kafka-schema-registry-client/7.3.0/kafka-schema-registry-client-7.3.0.jar:/root/.m2/repository/com/google/guava/guava/31.0-jre/guava-31.0-jre.jar:/root/.m2/repository/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar com.kopo.kafka.ConnectProducer
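    [Note] The hand-built classpath can be avoided by packaging a fat jar instead. A possible addition to the pom.xml from Step 2 using the standard maven-shade-plugin (the plugin version and configuration below are assumptions, not part of the original deck):

    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-shade-plugin</artifactId>
          <version>3.4.1</version>
          <executions>
            <execution>
              <phase>package</phase>
              <goals><goal>shade</goal></goals>
              <configuration>
                <transformers>
                  <!-- sets the Main-Class manifest entry so the jar runs with java -jar -->
                  <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                    <mainClass>com.kopo.kafka.ConnectProducer</mainClass>
                  </transformer>
                </transformers>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>

    With that in place, mvn package produces a self-contained jar that runs as: java -jar target/kafka-example-1.0.jar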
25. Test (same as stand-alone mode)
    • Consumer shell ({confluent home dir}/bin):
      ./kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic employee
    • [Diagram] The Java producer sends the Avro record (avroRecord.put("name", "test-name")), the console consumer displays it, and the sink connector writes it to the DB table.
26. References
    • https://mvnrepository.com/artifact/org.apache.avro/avro/1.11.1
    • https://mvnrepository.com/artifact/com.google.guava/guava/31.0-jre
    • https://mvnrepository.com/artifact/io.confluent/kafka-avro-serializer/7.3.0
    • https://www.confluent.io/hub/confluentinc/kafka-connect-jdbc
    • https://github.com/datastax/kafka-examples/blob/master/producers/pom.xml
    • https://itnext.io/howto-produce-avro-messages-to-kafka-ec0b770e1f54
    • https://docs.confluent.io/platform/current/schema-registry/connect.html#protobuf
    • https://www.youtube.com/watch?v=ZTVBfskFwYY
    • https://wecandev.tistory.com/110
    • https://docs.confluent.io/kafka-connectors/jdbc/current/sink-connector/overview.html#suggested-reading
    • https://docs.confluent.io/kafka-connectors/self-managed/userguide.html#worker-configuration-properties-file
