Configuring topic properties and retention policies

Configuring topic properties and retention policies is essential for managing data storage, retention periods, and message durability in Apache Kafka. By fine-tuning these settings, you can control how individual topics behave, meet data retention requirements, and optimize storage utilization. In this article, we walk through the process of configuring topic properties and retention policies in Kafka, with code samples, reference links, and resources to guide you along the way.

Configuring Topic Properties:

  1. Topic-Level Configuration:
  • Kafka allows setting various topic-level properties to customize the behavior of individual topics. These properties include the number of partitions, replication factor, and cleanup policy.
  2. Number of Partitions:
  • The number of partitions in a topic determines the parallelism and scalability of message processing. Increasing the number of partitions allows for higher throughput and parallel consumption by multiple consumers.
  3. Replication Factor:
  • The replication factor determines the number of replicas for each partition. An appropriate replication factor ensures data durability and fault tolerance by replicating data across multiple brokers.

Code Sample: Creating a Topic with Configured Properties in Java

Java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

public class KafkaTopicConfigurationExample {
    public static void main(String[] args) {
        String topicName = "my_topic";
        int numPartitions = 3;
        short replicationFactor = 2;

        Properties properties = new Properties();
        properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient adminClient = AdminClient.create(properties)) {
            // Create the topic with configured properties
            NewTopic newTopic = new NewTopic(topicName, numPartitions, replicationFactor);
            adminClient.createTopics(Collections.singleton(newTopic)).all().get();
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        }
    }
}

Reference Link: Apache Kafka Documentation – Topic-Level Configurations – https://kafka.apache.org/documentation/#topicconfigs
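
The cleanup policy mentioned above, along with any other topic-level setting, can also be applied at creation time by attaching configuration entries to the NewTopic object. The sketch below is a minimal illustration of this, assuming the same local broker at localhost:9092 and a hypothetical topic named my_configured_topic.

Code Sample: Creating a Topic with Topic-Level Configurations in Java

Java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

public class KafkaTopicCreateWithConfigsExample {
    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient adminClient = AdminClient.create(properties)) {
            // Topic-level configuration entries applied when the topic is created
            Map<String, String> topicConfigs = new HashMap<>();
            topicConfigs.put(TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_DELETE);
            topicConfigs.put(TopicConfig.RETENTION_MS_CONFIG, "604800000"); // 7 days

            // 3 partitions, replication factor 2, plus the configuration entries above
            NewTopic newTopic = new NewTopic("my_configured_topic", 3, (short) 2)
                    .configs(topicConfigs);
            adminClient.createTopics(Collections.singleton(newTopic)).all().get();
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        }
    }
}

Setting these entries at creation avoids a separate alter step later; the same keys can still be changed afterwards with incrementalAlterConfigs, as shown in the next section.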

Configuring Retention Policies:

  1. Log Compaction:
  • Kafka supports log compaction, which ensures that the latest key-value pair for each key is retained in the log. This is useful for maintaining a compacted log of changes for entities.
  2. Time-Based Retention:
  • Kafka allows setting a retention time for messages in a topic. Messages older than the specified time are automatically deleted from the log.
  3. Size-Based Retention:
  • Kafka also supports size-based retention, where you can specify the maximum size of the log for a topic. Once the log size exceeds the configured threshold, older messages are deleted. (A sketch covering log compaction and size-based retention follows the retention example below.)

Code Sample: Configuring Retention Policies for a Topic in Java

Java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.clients.admin.DescribeConfigsResult;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.config.TopicConfig;

import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

public class KafkaTopicRetentionExample {
    public static void main(String[] args) {
        String topicName = "my_topic";
        long retentionTimeMs = 86400000L; // 24 hours

        Properties properties = new Properties();
        properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient adminClient = AdminClient.create(properties)) {
            ConfigResource topicResource = new ConfigResource(ConfigResource.Type.TOPIC, topicName);

            // Describe the topic to retrieve and print its current retention setting
            DescribeConfigsResult describeResult = adminClient.describeConfigs(Collections.singleton(topicResource));
            Config currentConfig = describeResult.all().get().get(topicResource);
            System.out.println("Current retention.ms: " + currentConfig.get(TopicConfig.RETENTION_MS_CONFIG).value());

            // Update the retention time with an incremental alter operation
            ConfigEntry retentionEntry = new ConfigEntry(TopicConfig.RETENTION_MS_CONFIG, String.valueOf(retentionTimeMs));
            AlterConfigOp retentionOp = new AlterConfigOp(retentionEntry, AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> updateConfigs = new HashMap<>();
            updateConfigs.put(topicResource, Collections.singletonList(retentionOp));
            adminClient.incrementalAlterConfigs(updateConfigs).all().get();
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        }
    }
}

Reference Link: Apache Kafka Documentation – Log Compaction and Retention – https://kafka.apache.org/documentation/#compactionandretention
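
Log compaction and size-based retention can be applied with the same incremental alter API used above. The sketch below is a minimal illustration, again assuming a local broker at localhost:9092 and a hypothetical topic named my_compacted_topic; it sets the cleanup policy to "compact,delete" and caps each partition's log at roughly 1 GB.

Code Sample: Enabling Log Compaction and Size-Based Retention in Java

Java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.config.TopicConfig;

import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

public class KafkaTopicCompactionExample {
    public static void main(String[] args) {
        String topicName = "my_compacted_topic";

        Properties properties = new Properties();
        properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient adminClient = AdminClient.create(properties)) {
            ConfigResource topicResource = new ConfigResource(ConfigResource.Type.TOPIC, topicName);

            // "compact,delete" keeps the latest value per key while still allowing
            // size- and time-based deletion of old log segments
            AlterConfigOp cleanupPolicyOp = new AlterConfigOp(
                    new ConfigEntry(TopicConfig.CLEANUP_POLICY_CONFIG,
                            TopicConfig.CLEANUP_POLICY_COMPACT + "," + TopicConfig.CLEANUP_POLICY_DELETE),
                    AlterConfigOp.OpType.SET);

            // Cap each partition's log at roughly 1 GB (size-based retention)
            AlterConfigOp retentionBytesOp = new AlterConfigOp(
                    new ConfigEntry(TopicConfig.RETENTION_BYTES_CONFIG, String.valueOf(1024L * 1024L * 1024L)),
                    AlterConfigOp.OpType.SET);

            Map<ConfigResource, Collection<AlterConfigOp>> updateConfigs = new HashMap<>();
            updateConfigs.put(topicResource, Arrays.asList(cleanupPolicyOp, retentionBytesOp));
            adminClient.incrementalAlterConfigs(updateConfigs).all().get();
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        }
    }
}

Note that compaction only runs on rolled (inactive) log segments, so the effect may not be visible immediately on a topic that is still being written to.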

Helpful Video: “Apache Kafka – Configuring Retention and Cleanup Policies” by Simplilearn – https://www.youtube.com/watch?v=_xF7k5jpzRQ

Conclusion:

Configuring topic properties and retention policies in Apache Kafka allows you to optimize the behavior of individual topics, ensure data durability, and manage storage space effectively. By configuring the number of partitions, replication factor, log compaction, and retention time, you can customize topics to suit your application’s requirements.

In this article, we explored the process of configuring topic properties and retention policies in Kafka. The provided code samples demonstrated the creation of a topic with configured properties and the configuration of retention policies. The reference links to the official Kafka documentation and the suggested video resource offer further insights into topic configuration and retention policies.

By understanding and effectively configuring topic properties and retention policies, you can build scalable, reliable, and efficient data streaming applications using Apache Kafka.

