about:software


This blog talks about software and systems integration. No developers were harmed in the creation of this blog.. well, mostly, anyway..


Apache Kafka Integration with .net

In my last post on Apache Avro, I hinted at additional use cases for Avro serialized data. In this post, I’d like to walk through publishing that serialized data to an Apache Kafka topic.

For anyone who is not familiar with it yet, Apache Kafka is a high-throughput, distributed, partitioned messaging system. Data is published to Kafka topics, where it becomes available to any number of consumers subscribing to the topic.

Solution Setup

One of the interesting things about the Kafka project is that Kafka clients (other than the default JVM client) are not maintained by the project. The idea is that outside implementers, who are more familiar with their own development platforms, can develop clients with greater velocity. For .net itself, the project lists quite a few different external implementations. Unfortunately, these were not all at the same level of completeness, and it took a bit of poking around to figure things out. For the little project that I was looking at, we eventually decided to go with Microsoft’s implementation, called Kafkanet.

To use the Kafkanet client, start off by downloading the source code and building the solution. To make it easier to consume in my solution, I packaged up the binaries in a nuget package. The only dependency needed for this nuget package was the Apache Zookeeper .net client, which is available on nuget.org. I’ve added my nuspec file below for reference, should you need it.

<?xml version="1.0" encoding="utf-8"?>
<package xmlns="http://schemas.microsoft.com/packaging/2012/06/nuspec.xsd">
    <metadata>
        <id>Microsoft.Kafkanet</id>
        <version>0.0.58.1</version>
        <title>KafkaNet.Library</title>
        <authors>Microsoft</authors>
        <projectUrl>https://github.com/Microsoft/Kafkanet</projectUrl>
        <iconUrl>https://kafka.apache.org/images/kafka_logo.png</iconUrl>
        <requireLicenseAcceptance>false</requireLicenseAcceptance>
        <description>Build of Microsoft Kafkanet solution https://github.com/Microsoft/Kafkanet</description>
        <dependencies>
            <group targetFramework=".NETFramework4.5">
                <dependency id="ZooKeeper.Net" version="3.4.6.2" />
            </group>
        </dependencies>
    </metadata>
    <files>
        <file src="lib\KafkaNET.Library.dll" target="lib\KafkaNET.Library.dll" />
    </files>
</package>

Sample code

Picking up where I left off in the Avro serialization example, here’s some sample code that takes the serialized data and pushes it to a Kafka topic:

//Connect to Kafka instance

var brokerConfig = new BrokerConfiguration()
{
	BrokerId = 0,
	Host = "kafka-dev-instance",
	Port = 9092
};
var config = new ProducerConfiguration(new List<BrokerConfiguration> { brokerConfig });

//Create Avro serialized stream 

var stream = new MemoryStream();
Encoder encoder = new Avro.IO.BinaryEncoder(stream);
var writer = 
	new Avro.Specific.SpecificWriter<Error>(new SpecificDefaultWriter(error.Schema));
writer.Write(error, encoder); 

//Publish to Kafka

var msg = new Kafka.Client.Messages.Message(stream.ToArray());
var producerData = 
	new ProducerData<string, Kafka.Client.Messages.Message>("kafka-topicname", timestamp.value.ToString(), msg);
using (var producer = new Producer<string, Kafka.Client.Messages.Message>(config))
{
	producer.Send(producerData);
} 
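
Before publishing, it can be handy to check that the bytes really decode back into the original record. Below is a minimal sketch of a round-trip helper, assuming the Apache Avro C# library from the previous post and an avrogen-generated Error class. The AvroBytes class and its method names are my own hypothetical helpers, not part of Avro or Kafkanet, and I have not run this against every Avro version.

```csharp
using System.IO;
using Avro;
using Avro.IO;
using Avro.Specific;

// Hypothetical helper for illustration; not part of Avro or Kafkanet.
public static class AvroBytes
{
    // Serialize an avrogen-generated record to Avro binary.
    public static byte[] Serialize<T>(T record) where T : ISpecificRecord
    {
        using (var stream = new MemoryStream())
        {
            var encoder = new BinaryEncoder(stream);
            var writer = new SpecificWriter<T>(record.Schema);
            writer.Write(record, encoder);
            encoder.Flush();
            return stream.ToArray();
        }
    }

    // Decode Avro binary back into a record. The same schema is used as
    // both writer and reader schema, so no schema evolution is involved.
    public static T Deserialize<T>(byte[] payload, Schema schema)
        where T : ISpecificRecord
    {
        using (var stream = new MemoryStream(payload))
        {
            var decoder = new BinaryDecoder(stream);
            var reader = new SpecificReader<T>(schema, schema);
            // Passing default(T) (null for classes) lets the reader create a new instance.
            return reader.Read(default(T), decoder);
        }
    }
}

// Usage, assuming `error` is the Error record from the sample above:
//   byte[] payload = AvroBytes.Serialize(error);
//   Error decoded = AvroBytes.Deserialize<Error>(payload, error.Schema);
```

Comparing a field or two on the decoded record against the original is usually enough to catch a schema mismatch before the message ever reaches Kafka.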

… and voila, you are now writing your Avro serialized data into a Kafka topic. As you can see, the code is mostly straightforward, but it did take a few hours of digging into the code to get this right.

Hope this proves helpful to anyone else trying to do something similar. As always, feel free to leave a comment.
