https://www.kai-waehner.de/blog/2023/09/15/how-to-build-a-real-time-advertising-platform-with-apache-kafka-and-flink/
Cognossimplified
Wednesday 17 July 2024
Integrating Kafka with your API
Integrating Kafka with your application API without making direct changes to the API itself can be achieved using various techniques and tools. This approach typically involves setting up an intermediary service, or middleware, that intercepts API requests and forwards the relevant data to Kafka. Here are a few strategies:
Example - you want to capture the ad requests your application is making, or track which songs users are playing on Spotify.
Strategy 1: Sidecar Pattern
The sidecar pattern involves running a companion service alongside your main application that handles Kafka integration. This can be particularly useful in containerized environments like Kubernetes.
Steps to Implement Sidecar Pattern
Create a Sidecar Service:
- Develop a separate service that consumes API logs or data and sends it to Kafka.
Deploy the Sidecar:
- Deploy the sidecar service alongside your main application.
Example Implementation
Sidecar Service (Python):
import json

import requests
from confluent_kafka import Producer

# Kafka producer configuration
conf = {
    'bootstrap.servers': 'localhost:9092',
    'client.id': 'sidecar-producer'
}

# Producer takes the config dict as a single argument
producer = Producer(conf)

def send_to_kafka(topic, message):
    producer.produce(topic, json.dumps(message).encode('utf-8'))
    producer.flush()
    print(f"Message sent to Kafka topic {topic}")

def intercept_and_forward(url, headers, params=None):
    response = requests.get(url, headers=headers, params=params)
    data = response.json()
    send_to_kafka('your_topic', data)
    return data

# Example usage
if __name__ == "__main__":
    url = "http://example.com/api/data"
    headers = {"Authorization": "Bearer token"}
    data = intercept_and_forward(url, headers)
    print(data)
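Calling flush() after every message, as in the sidecar above, blocks on each send. A lighter-weight variant (a sketch, assuming the same confluent-kafka client) registers a delivery callback and polls the producer instead. The build_event helper and its field names are hypothetical - adapt them to whatever your consumers expect:

```python
import json

def build_event(method, url, status_code, payload):
    # Hypothetical event schema for one intercepted API call.
    return {
        "method": method,
        "url": url,
        "status": status_code,
        "data": payload,
    }

def delivery_report(err, msg):
    # confluent-kafka invokes this once per message, from poll()/flush().
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [partition {msg.partition()}]")

# With the producer from the sidecar above:
# event = build_event("GET", url, response.status_code, data)
# producer.produce('your_topic',
#                  json.dumps(event).encode('utf-8'),
#                  callback=delivery_report)
# producer.poll(0)  # serve callbacks without blocking; flush() only at shutdown
```

This keeps the sidecar responsive under load, since delivery confirmation happens asynchronously rather than per request.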
Kubernetes Deployment Example
Kubernetes Pod Definition:
apiVersion: v1
kind: Pod
metadata:
  name: api-with-sidecar
spec:
  containers:
  - name: main-app
    image: your-main-app-image
    ports:
    - containerPort: 80
  - name: sidecar
    image: your-sidecar-image
    env:
    - name: KAFKA_BOOTSTRAP_SERVERS
      value: "localhost:9092"
Strategy 2: API Gateway
Using an API gateway allows you to manage, secure, and monitor API traffic, as well as intercept and forward data to Kafka without modifying the backend API.
Steps to Implement API Gateway Integration
Set Up API Gateway:
- Use an API gateway solution like Kong, Apigee, or Amazon API Gateway.
Configure Plugins or Middleware:
- Use plugins or middleware to intercept requests and forward data to Kafka.
Example with Kong Gateway
Install Kong and Kafka Plugin:
- Follow the Kong installation guide.
- Install the Kafka logging plugin for Kong.
Configure Kafka Plugin:
services:
- name: example-service
  url: http://example.com
  routes:
  - name: example-route
    paths:
    - /api
  plugins:
  - name: kafka-log
    config:
      bootstrap_servers: "localhost:9092"
      topic: "your_topic"
      producer_config:
        acks: "all"
- Apply Configuration:
# Assuming you have a Docker-based setup
docker-compose up -d
Strategy 3: Log-Based Integration
You can also use a log-based approach where application logs are collected and processed to send data to Kafka.
Steps to Implement Log-Based Integration
Set Up Log Forwarder:
- Use a log forwarder like Filebeat, Fluentd, or Logstash.
Configure Log Forwarder to Send Data to Kafka:
Filebeat Configuration Example:
filebeat.inputs:
- type: log
  paths:
  - /var/log/myapp/*.log
output.kafka:
  hosts: ["localhost:9092"]
  topic: "your_topic"
  codec.json:
    pretty: true
- Run Log Forwarder:
filebeat -e -c filebeat.yml
Summary
These strategies allow you to integrate Kafka with your application API without making direct changes to the API code:
- Sidecar Pattern: Run a companion service alongside your application.
- API Gateway: Use an API gateway to intercept and forward data.
- Log-Based Integration: Use log forwarders to process and send logs to Kafka.
Choose the strategy that best fits your architecture and operational requirements.
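Whichever strategy you pick, it helps to verify that events are actually landing in the topic. A minimal verification sketch, assuming a local broker and the same confluent-kafka package used above (the topic name and group id are placeholders):

```python
import json

# Consumer settings for verifying forwarded events (assumed local broker).
consumer_conf = {
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'sidecar-verifier',
    'auto.offset.reset': 'earliest',
}

def decode_event(raw: bytes) -> dict:
    # Decode one Kafka message produced by any of the strategies above,
    # which all JSON-encode their payloads.
    return json.loads(raw.decode('utf-8'))

# With confluent-kafka installed and a broker running:
# from confluent_kafka import Consumer
# c = Consumer(consumer_conf)
# c.subscribe(['your_topic'])
# msg = c.poll(5.0)
# if msg is not None and not msg.error():
#     print(decode_event(msg.value()))
# c.close()
```

Reading with a separate consumer group like this leaves the real downstream consumers' offsets untouched.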
Monday 21 August 2023
Interview Preparation Topics - Manager
https://igotanoffer.com/blogs/tech/facebook-engineering-manager-interview
- People Management
- Project Management
- Culture Fit
- System Design
- Coding Interview
Tuesday 6 June 2023
Leadership and Growth Mindset
https://sites.google.com/deliveroo.com/tech-leadership/engineering-manager-prep/leadership-growth-mindset
Behavioural Interview Prep - Manager
https://sites.google.com/deliveroo.com/tech-leadership/engineering-manager-prep/behavioural
Resolving Conflicts
- Everyone is right in their own eyes.
- Avoid false dichotomies (black or white thinking).
- People generally present a position, but rarely the underlying issue. That is the problem: the person resolving the conflict needs to uncover the real issue by using listening skills.
- Don't dig in. Don't try to explain why you think you are right; "I am right" arguments lead to escalation. Look for the soundness of each argument instead.
- Don't use exaggerated statements like "you always...".
- Seek unity of purpose: focus on where we are going.
- It's not the person but the process. Ask "how can we make sure such conflicts do not appear in future?" - for example, by using the RACI framework.
- Look for win-wins and mutual benefits.
- Long term: develop a culture where people listen to each other.
When it comes to resolving conflicts, a few of these communication skills come in handy:
- Empathy - when you empathise with people it allows them to move past things; otherwise you will hear people repeating the same points across many conversations.
- Acknowledgement - when listening to someone, acknowledgement such as nodding or saying yes motivates the speaker; the absence of this kind of feedback is demotivating.
- Reflecting or Rephrasing - listening is more than hearing. We need to understand what the other person is feeling and rephrase it. Reflecting is holding up a mirror to the other person: "this is what I think you said."
- Brainstorming - allowing people to come up with ideas without judging. An idea might not be good in itself, but it might lead someone else to a better idea along the same lines.
- Find agreement
- Clarify disagreement
- Express empathy
- Seek alternatives
- Agree on criteria for a solution
- Combine and create
- Reach consensus
Project Walkthrough for Interview in STAR Format
Overview of Project
The company had a roadmap to create a financial data lake. The software engineering team was tasked with creating the infrastructure for this data lake:
- the actual data lake to store the data
- an ETL tool which would connect to this data lake, on which jobs would be created
- tools to query this data lake
The data engineering team, which I was leading, was tasked with migrating data pipelines to the data lake using the tools developed by the software engineering team. Wireframes showing how the tool would look were shared with the team, and the data engineering team waited for the software engineering team to complete their work so we could start on the pipeline migration.
STAR Format follows
Situation (Challenges you faced)
- Six months prior to the data lake project completion date, my senior manager discussed the project and asked me to get started. At this time his understanding was still that a tool would be developed.
- After discussing with the software engineering team, we learned that they had hit roadblocks and the entire effort was delayed; they had not even started the tool development and had not completed the data lake infrastructure.
- I escalated to senior management, clarifying that the data engineering project could not start until the ETL tool development was completed.
- After multiple hard discussions with senior management, the software engineering team accepted that they had completely missed the timeline and there was no time left for data engineering to migrate.
- The software engineering team would not build the tool; instead they would only build a data lake API, which the data engineering team would use in their solution.
- We were to run our development effort in parallel with the data lake API development.
- Data engineering would work on the migration using the AWS platform and the data lake API.
Task (Decision you took)
- We were tasked with creating an approach for the migration, with the aim of completing it in the next six months.
Action (How you worked with your team and cross-functional peers)
- Discussed with the team and came up with a few approaches for how we could do the project.
- Had POCs done on multiple approaches within a week's timeframe, and presented the pros and cons to management.
- Decided on the approach to follow.
- Led the development effort on the solution proposed by the team; during development, optimized the current pipeline, which led to an improvement in SLA.
- Had to deal with multiple issues, as the data lake APIs were not fully mature and had bugs with the potential to derail the roadmap.
- Migrated the ETL data pipeline within the next six months.
- Created a reusable solution which could be used by multiple teams.
Result (Metrics that were impacted)
- We were able to migrate the pipeline ahead of the anticipated time, allowing ample time for user testing and contributing to the overall success of the data lake.
- Reduced data delay from 6 hours to 1 hour.
- Showcased to the organization best practices around ETL pipeline development: infrastructure as code, unit test cases for data pipelines, and performance optimizations.