Oracle Service Bus Transport for Apache Kafka (Part 1)


Introduction

Few of us realize it, but the heart of OOP (Object-Oriented Programming) has more to do with messaging than with objects and classes, or with related design principles like inheritance and polymorphism. At the very beginning, OOP was designed to help developers implement data exchange between objects, so that these objects could be coordinated to execute some business process, dividing the problem into smaller pieces with the key benefit of code that is easier to maintain. Another original OOP design concept dictates that all objects in the system are equally accountable for data sharing; no object holds more responsibility than another regarding messaging. This design principle helps OO applications scale as a whole, because objects can scale individually.

Considering that the first object-oriented languages (e.g., Lisp, MIT ALGOL, Simula 67) were created in the 1960s, we can all agree that messaging has been around for at least four decades. However, messaging looks quite different today from these original OO concepts, and this discrepancy is a direct result of industry demands and technology evolution. Let’s do a quick review:

- To increase performance, objects were supposed to run concurrently using threads;

- To increase scalability, objects were supposed to be spread across different machines;

- To increase availability, messages were stored in a central repository hosted on dedicated machines;

- To increase manageability, this repository controls most of the message handling;

- To increase adoption, standards were created with all these characteristics in mind.

Most of these characteristics define what we refer to today as MOM (Message-Oriented Middleware). This type of middleware is commonly used for exchanging messages in a reliable, fault-tolerant and high-performance way. Numerous standards were created to define APIs for MOMs; JMS (Java Message Service) and AMQP (Advanced Message Queuing Protocol) are perhaps the most popular, with many great implementations available, both commercial and open-source.
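To make this “lightweight client, heavyweight broker” model concrete, here is a minimal sketch of sending a text message through the standard JMS 1.1 API. The JNDI names are placeholders; the actual names depend entirely on how your MOM is configured:

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.naming.InitialContext;

public class JmsSenderExample {

    public static void main(String[] args) throws Exception {
        // both objects are administered by the MOM and fetched via JNDI
        // (the JNDI names below are examples only)
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/ordersQueue");

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            // the client stays thin: persistence, routing and delivery
            // guarantees are all the broker's job
            producer.send(session.createTextMessage("Hello MOM"));
        } finally {
            connection.close();
        }
    }
}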

But industry demands never slow down, and like any other technology, MOMs have to keep up with them in order not to fall into obsolescence. There is one major demand currently threatening most implementations, and that is data volume: the need to handle not hundreds or thousands but millions of messages per second. The volume of data generated has increased exponentially in the last decade and keeps growing every second. Most implementations follow a MOM design concept, dictated by an API specification, in which consumers and producers need to be as lightweight as possible, so the MOM ends up handling most of the messaging work. While this is good from an ease-of-use standpoint, it creates a serious bottleneck in the architecture that eventually prevents it from scaling up to meet these ever-increasing data volumes.

Of course, not all applications need to handle millions of messages simultaneously, and most of these implementations might seem “good enough”. However, what happens if one of these applications needs to be modified to handle a high data volume scenario? What are the consequences? Common remedies include application adjustments and fine-tuning that by themselves bring a new set of complexities to the architecture. For instance:

- Disabling message persistence to improve performance, potentially losing messages;

- For JVM-based implementations, increasing the heap size, possibly causing long GC pauses;

- Trimming the Java EE application server to use only JMS, perhaps losing flexibility and supportability;

- Giving producers and consumers more responsibility, breaking the API specification;

- Using high-speed appliances to improve performance, but also increasing the TCO.

It seems that industry demands over the years have caused us to deviate considerably from the original OOP design principle that the system scales as a whole when each object scales individually. This principle is more critical than ever in an era where scalability is the most important driver due to high data volumes. Finally, we are also in the Cloud Computing era, where a significant percentage of enterprise applications and computing infrastructure is either running in, or moving to, the Cloud.

Some years ago, a few LinkedIn enterprise architects responsible for the company’s data infrastructure faced these same challenges while building data pipelines throughout the company. They were trying to capture and manage relational database changes, NoSQL database changes, ETL tasks, messaging, metrics, and to integrate tools like Apache Hadoop. After ending up with a different approach for each scenario, they realized that a single approach would be more maintainable, as long as it addressed the scalability challenges of every scenario. This is, in simplified form, how Apache Kafka was created. Apache Kafka (Kafka for short) is a distributed, partitioned and replicated commit log service that provides features similar to other messaging systems, but with a unique design that allows it to handle terabytes of data on a single node. Most importantly, the Kafka architecture allows it to scale as a whole because each layer (including producers and consumers) is designed from the ground up to be scalable.
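To give a flavor of how a Kafka client works, here is a minimal sketch of a standalone producer written against the 0.8.1.1 Java API that ships with the Kafka libraries used later in this article. The broker address and topic name are examples only:

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class OrdersProducerExample {

    public static void main(String[] args) {
        Properties props = new Properties();
        // brokers used to bootstrap the topic metadata (example address)
        props.put("metadata.broker.list", "localhost:9092");
        // encode message payloads as plain strings
        props.put("serializer.class", "kafka.serializer.StringEncoder");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

        // the producer connects straight to the brokers; since no message
        // key is given, messages are spread across the topic partitions
        producer.send(new KeyedMessage<String, String>("orders", "Hello World from plain Java"));
        producer.close();
    }
}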

After seeing numerous requests from customers and partners about integrating with Kafka, the A-Team decided to write a native transport for Oracle Service Bus (Service Bus for short) to allow connecting and exchanging data with Kafka, supporting both message consumption from and production to Kafka topics. This is done in a way that allows Service Bus to scale jointly with Kafka, both vertically and horizontally.

The Apache Kafka transport is provided free of charge to use “AS-IS”, but without any official support from Oracle. Bugs, feedback and enhancement requests are welcome, but need to be submitted using the comments section of this blog, and the A-Team reserves the right to help in a best-effort capacity.

In order to detail how this transport works, the article is divided into three parts. This first part provides a basic introduction, covering how the transport can be installed and how to create Proxy and Business Services to read from and write to Kafka. The second part will cover more advanced details, such as how to implement message partitioning and how to scale the architecture as a whole. The third and last part will discuss common use cases for this transport. This article will not cover Kafka in detail, focusing instead on the Service Bus aspects of the solution. If you need a Kafka overview, it is strongly recommended that you read the official documentation and spend some time trying out its quick start. Jay Kreps, one of the Kafka co-creators, wrote a very good in-depth article about Kafka that also covers all the details.

Installing the Transport in Service Bus

The first step is to make the Kafka main libraries available on the Service Bus classpath. At runtime, the transport implementation uses classes from these libraries to establish connections to the Kafka brokers and to ZooKeeper. As a best practice, put these libraries in the following folder: $OSB_DOMAIN/lib. JAR files within this folder are picked up and added dynamically to the end of the server classpath at server startup. Considering Kafka version 0.8.1.1, the libraries that need to be copied are:

$KAFKA_HOME/libs/kafka_2.9.2-0.8.1.1.jar

$KAFKA_HOME/libs/log4j-1.2.15.jar

$KAFKA_HOME/libs/metrics-core-2.2.0.jar

$KAFKA_HOME/libs/scala-library-2.9.2.jar

$KAFKA_HOME/libs/zkclient-0.3.jar

$KAFKA_HOME/libs/zookeeper-3.3.4.jar

Here, $KAFKA_HOME should point to the exact location where Kafka was installed. Keep in mind that the list above reflects Kafka version 0.8.1.1; if you decide to use a more recent version of Kafka, the filenames will differ.
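For instance, assuming both environment variables are set, the copy can be done in a single command (filenames follow the 0.8.1.1 layout listed above):

cp $KAFKA_HOME/libs/kafka_2.9.2-0.8.1.1.jar \
   $KAFKA_HOME/libs/log4j-1.2.15.jar \
   $KAFKA_HOME/libs/metrics-core-2.2.0.jar \
   $KAFKA_HOME/libs/scala-library-2.9.2.jar \
   $KAFKA_HOME/libs/zkclient-0.3.jar \
   $KAFKA_HOME/libs/zookeeper-3.3.4.jar \
   $OSB_DOMAIN/lib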

The second step is deploying the transport within the Service Bus domain. You can grab a copy of the transport here. Download the zip file and extract its contents to a staging folder. Copy the implementation files (kafka-transport.ear and kafka-transport.jar) to the same folder where the other transports are kept:

$MW_HOME/osb/lib/transports

And copy the JDeveloper plugin descriptor (transport-kafka.xml) to the plugins folder:

$MW_HOME/osb/config/plugins

Finally, deploy the kafka-transport.ear file as an enterprise application to your Service Bus domain. You can use any deployment method of your choice (Administration Console, WLST, etc.); just make sure to target this deployment to all servers and clusters within your domain, including the administration server. As a best practice, name this deployment “Service Bus Kafka Transport Provider”.
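For example, with the weblogic.Deployer command-line tool, the deployment could look like the following; the admin URL, credentials and target names are placeholders for your own environment:

java weblogic.Deployer -adminurl t3://adminhost:7001 -username weblogic -password <your-password> \
     -deploy -name "Service Bus Kafka Transport Provider" \
     -targets AdminServer,osb_cluster -source kafka-transport.ear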

Receiving Data from Apache Kafka

This section will show how to create a Service Bus Proxy Service that fetches messages from a Kafka topic and processes them using a Pipeline. The following steps will assume that you have an up-and-running Kafka deployment, and that a topic named “orders” was previously created.

Creating the Service Bus Artifacts

1) In the Service Bus Console, create a session to modify the projects.

2) Create a new project called “article” under the projects folder.

3) Right-click the newly created project and choose Create > Pipeline.

4) Name this Pipeline “OrdersListenerImpl”.

5) In the Service Type field, choose “Messaging”.

6) Set “Text” as the Request Type and “None” as the Response Type.

7) Leave the “Expose as a Proxy Service” check box unchecked.

8) Click the Create button to finish.

9) Open the Message Flow editor.

10) Implement the Message Flow with a Pipeline Pair, creating one single stage in the request pipeline that prints the headers and the body of the incoming message (a note on the Log actions follows these steps). Here is an example of this implementation.

[Figure 1]

11) Right-click the article project and choose Create > Proxy Service.

12) Name this Proxy Service “OrdersListener”.

13) In the Protocol field, choose “kafka”, then click the Next button.

[Figure 2]

14) Set “Text” as the Request Type and “None” as the Response Type.

15) Click the Next button to continue.

16) In the Endpoint URI field, enter the ZooKeeper server details (e.g., localhost:2181).

[Figure 3]

17) Click on the Create button to finish.

18) In the General page, associate this Proxy Service with the previously created Pipeline.

[Figure 4]

19) Click on the Transport Details tab.

20) In the Topic Name field, enter “orders”.

[Figure 5]

21) Save the entire configuration and activate the changes.
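Regarding the stage implementation from step 10, a minimal setup that matches the sample output shown later in this article is two Log actions in the request pipeline, both with severity set to Warning:

- Log action 1: expression $inbound, which carries the transport metadata, including the Kafka-specific headers;
- Log action 2: expression $body, which carries the message payload.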

Testing the Scenario

In order to test this scenario, you can use the producer utility tool that comes with Kafka. In the “bin” folder of your Apache Kafka installation, type:

$KAFKA_HOME/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic orders <ENTER>

This will open the producer console for sending messages. Anything you type, followed by the Enter key, will be immediately sent to the Kafka topic. For this simple test, type:

Hello World from Apache Kafka <ENTER>

You should see an output on Service Bus like this:

<Jun 16, 2015 7:02:41 PM EDT> <Warning> <oracle.osb.logging.pipeline> <BEA-000000> < [HandleIncomingPayload, HandleIncomingPayload_request, Logging, REQUEST] <con:endpoint name="ProxyService$article$OrdersListener" xmlns:con="http://www.bea.com/wli/sb/context">
  <con:service/>
  <con:transport>
    <con:uri>localhost:2181</con:uri>
    <con:mode>request</con:mode>
    <con:qualityOfService>best-effort</con:qualityOfService>
    <con:request xsi:type="kaf:KafkaRequestMetaDataXML" xmlns:kaf="http://oracle/ateam/sb/transports/kafka" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <tran:headers xsi:type="kaf:KafkaRequestHeadersXML" xmlns:tran="http://www.bea.com/wli/sb/transports">
        <kaf:message-key/>
        <kaf:partition>1</kaf:partition>
        <kaf:offset>5</kaf:offset>
      </tran:headers>
    </con:request>
  </con:transport>
  <con:security>
    <con:transportClient>
      <con:username>weblogic</con:username>
      <con:principals>
        <con:group>AdminChannelUsers</con:group>
        <con:group>Administrators</con:group>
        <con:group>IntegrationAdministrators</con:group>
        <con:group>Monitors</con:group>
      </con:principals>
    </con:transportClient>
  </con:security>
</con:endpoint>> 

<Jun 16, 2015 7:02:41 PM EDT> <Warning> <oracle.osb.logging.pipeline> <BEA-000000> < [HandleIncomingPayload, HandleIncomingPayload_request, Logging, REQUEST] Hello World from Apache Kafka>
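Note the Kafka-specific headers (message-key, partition and offset) in the transport metadata above. If your pipeline logic needs them, they can be read from the $inbound context variable with XPath expressions along these lines, assuming the ctx, tran and kaf prefixes are bound to the namespaces shown in the output:

$inbound/ctx:transport/ctx:request/tran:headers/kaf:partition/text()
$inbound/ctx:transport/ctx:request/tran:headers/kaf:offset/text()
$inbound/ctx:transport/ctx:request/tran:headers/kaf:message-key/text()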

Sending Data to Apache Kafka

This section will show how to create a Service Bus Business Service that sends messages to a Kafka topic. The following steps will assume that you have an up-and-running Kafka deployment, and that a topic named “orders” was previously created.

Creating the Service Bus Artifacts

1) In the Service Bus Console, create a session to modify the projects.

2) Right-click the article project and choose Create > Business Service.

3) Name this Business Service “OrdersProducer”.

4) In the Transport field, choose “kafka”, then click the Next button.

[Figure 6]

5) Set “Text” as the Request Type and “None” as the Response Type.

6) Click on the Next button to continue.

7) In the Endpoint URIs section, enter the Kafka broker details (e.g., localhost:9092).

[Figure 7]

8) Click the Create button to finish.

9) Click the Transport Details tab.

10) In the Topic Name field, enter “orders”.

[Figure 8]

11) Save the entire configuration and activate the changes.

Testing the Scenario

In order to test this scenario, you can use the Test Console that comes with Service Bus.

[Figure 9]

Type “Hello World from Oracle Service Bus” and click the Execute button to perform the test. Since we have already deployed a Proxy Service that fetches messages from the Kafka topic, you should see output on Service Bus similar to this:

<Jun 16, 2015 7:27:48 PM EDT> <Warning> <oracle.osb.logging.pipeline> <BEA-000000> < [HandleIncomingPayload, HandleIncomingPayload_request, Logging, REQUEST] <con:endpoint name="ProxyService$article$OrdersListener" xmlns:con="http://www.bea.com/wli/sb/context">
  <con:service/>
  <con:transport>
    <con:uri>localhost:2181</con:uri>
    <con:mode>request</con:mode>
    <con:qualityOfService>best-effort</con:qualityOfService>
    <con:request xsi:type="kaf:KafkaRequestMetaDataXML" xmlns:kaf="http://oracle/ateam/sb/transports/kafka" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <tran:headers xsi:type="kaf:KafkaRequestHeadersXML" xmlns:tran="http://www.bea.com/wli/sb/transports">
        <kaf:message-key/>
        <kaf:partition>1</kaf:partition>
        <kaf:offset>6</kaf:offset>
      </tran:headers>
    </con:request>
  </con:transport>
  <con:security>
    <con:transportClient>
      <con:username>weblogic</con:username>
      <con:principals>
        <con:group>AdminChannelUsers</con:group>
        <con:group>Administrators</con:group>
        <con:group>IntegrationAdministrators</con:group>
        <con:group>Monitors</con:group>
      </con:principals>
    </con:transportClient>
  </con:security>
</con:endpoint>> 

<Jun 16, 2015 7:27:48 PM EDT> <Warning> <oracle.osb.logging.pipeline> <BEA-000000> < [HandleIncomingPayload, HandleIncomingPayload_request, Logging, REQUEST] Hello World from Oracle Service Bus>
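If you would like to double-check, independently of Service Bus, that the message actually reached the topic, you can use the consumer utility tool that also comes with Kafka; the ZooKeeper address below assumes the same local setup used throughout this article:

$KAFKA_HOME/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic orders --from-beginning <ENTER>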

Development using FMW JDeveloper 12c

Alternatively, you can use Fusion Middleware JDeveloper 12c to create Service Bus projects and implement integration scenarios using the Kafka transport. This is probably a better approach than using the Service Bus Console when designing end-to-end integration flows, since you can leverage other features available in the IDE, such as the debugger, mappers, XPath editors, drag-and-drop components, etc.

[Figure 10]

Depending on which version of Fusion Middleware JDeveloper you use, there will be some warnings in the log stating “Failed to invoke edit for <ENDPOINT>: SCA endpoint not available”. This is a known bug related to custom transports that Oracle engineering is already working to solve. Apart from these warnings, everything works fine and you will be able to deploy the Service Bus project from the IDE normally. Also, note that custom transports are not available in the component palette, but you can create them using the File > New approach.

Short Interview with Jay Kreps

While I was writing this article, I had the pleasure and honor of talking with Jay Kreps, one of the Kafka co-creators and CEO of Confluent, the company founded by the team that built Kafka, which provides an enterprise-ready version of Kafka called the Stream Data Platform. Here’s a transcript of that brief interview.

- Tell us a Little Bit About Yourself:

“I’m the CEO at Confluent. I’m also one of the co-creators of Apache Kafka. I was previously one of the lead architects for data infrastructure at LinkedIn where Kafka was developed.”

- How popular is Apache Kafka Today?

“It’s really popular. You can see some of the users who have added themselves publicly here.”

- What do you Think about the Service Bus Kafka Transport?

“I’m really excited to see the integration. We want to be able to plug in everything!”

- Would you like to see Oracle Supporting Kafka on its Cloud Offerings?

“Sure!”

On behalf of the Oracle community, I would like to thank Jay for his time and support on this project.

Conclusion

Apache Kafka is getting a considerable amount of traction from the industry with its ability to handle high volumes of data (terabyte scale) with minimal hardware resources, allowing Cloud-based architectures to achieve high-throughput, low-latency messaging. Providing a native transport for Kafka in Service Bus allows for some interesting possibilities, and we hope customers consider this option when thinking about using these technologies to solve challenging integration and messaging problems.

This article is the first part of a series that intends to show how Service Bus can be leveraged to connect and exchange data with Kafka, allowing developers and architects to gain the best of both worlds. As soon as other parts of the series become available, links to them will be provided here.

