Wednesday 17 July 2024

Good Use Case for Kafka / Flink Streaming - Building an Ad Server

 https://www.kai-waehner.de/blog/2023/09/15/how-to-build-a-real-time-advertising-platform-with-apache-kafka-and-flink/

Integrating Kafka with your API

 Integrating Kafka with your application API without making direct changes to the API itself can be achieved using various techniques and tools. The approach typically involves setting up an intermediary service or middleware that intercepts API requests and forwards the necessary data to Kafka. Here are a few strategies to achieve this:


Example - you want to know which ad requests your application is making, or which songs users are playing on Spotify.


Strategy 1: Sidecar Pattern

The sidecar pattern involves running a companion service alongside your main application that handles Kafka integration. This can be particularly useful in containerized environments like Kubernetes.

Steps to Implement Sidecar Pattern

  1. Create a Sidecar Service:

    • Develop a separate service that consumes API logs or data and sends it to Kafka.
  2. Deploy the Sidecar:

    • Deploy the sidecar service alongside your main application.

Example Implementation

Sidecar Service (Python):

python
import json

import requests
from confluent_kafka import Producer

# Kafka producer configuration
conf = {
    'bootstrap.servers': 'localhost:9092',
    'client.id': 'sidecar-producer'
}

# confluent_kafka's Producer takes the config dict directly,
# not keyword arguments
producer = Producer(conf)

def send_to_kafka(topic, message):
    producer.produce(topic, json.dumps(message).encode('utf-8'))
    producer.flush()
    print(f"Message sent to Kafka topic {topic}")

def intercept_and_forward(url, headers, params=None):
    # Call the upstream API, forward the response payload to Kafka,
    # and return the data to the caller unchanged
    response = requests.get(url, headers=headers, params=params)
    data = response.json()
    send_to_kafka('your_topic', data)
    return data

# Example usage
if __name__ == "__main__":
    url = "http://example.com/api/data"
    headers = {"Authorization": "Bearer token"}
    data = intercept_and_forward(url, headers)
    print(data)
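To verify the sidecar is actually publishing, a quick check is to consume from the same topic. A minimal sketch, assuming the same broker address and the placeholder topic name your_topic from the example above (the group id sidecar-verify is an arbitrary name chosen here):

python
import json

from confluent_kafka import Consumer

# Consumer configuration matching the sidecar example's broker
conf = {
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'sidecar-verify',
    'auto.offset.reset': 'earliest'
}

consumer = Consumer(conf)
consumer.subscribe(['your_topic'])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        # Messages were produced as UTF-8 encoded JSON by the sidecar
        event = json.loads(msg.value().decode('utf-8'))
        print(f"Received event: {event}")
finally:
    consumer.close()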

Kubernetes Deployment Example

Kubernetes Pod Definition:

yaml
apiVersion: v1
kind: Pod
metadata:
  name: api-with-sidecar
spec:
  containers:
    - name: main-app
      image: your-main-app-image
      ports:
        - containerPort: 80
    - name: sidecar
      image: your-sidecar-image
      env:
        - name: KAFKA_BOOTSTRAP_SERVERS
          value: "localhost:9092"

Strategy 2: API Gateway

Using an API gateway allows you to manage, secure, and monitor API traffic, as well as intercept and forward data to Kafka without modifying the backend API.

Steps to Implement API Gateway Integration

  1. Set Up API Gateway:

    • Use an API gateway solution like Kong, Apigee, or Amazon API Gateway.
  2. Configure Plugins or Middleware:

    • Use plugins or middleware to intercept requests and forward data to Kafka.

Example with Kong Gateway

  1. Install Kong and Kafka Plugin:

    • Follow the Kong installation guide.
    • Install the Kafka logging plugin for Kong.
  2. Configure Kafka Plugin:

yaml
services:
  - name: example-service
    url: http://example.com
    routes:
      - name: example-route
        paths:
          - /api
    plugins:
      - name: kafka-log
        config:
          bootstrap_servers: "localhost:9092"
          topic: "your_topic"
          producer_config:
            acks: "all"
  3. Apply Configuration:
bash
# Assuming you have a Docker-based setup
docker-compose up -d
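Once the route and plugin are in place, any traffic through the gateway should produce log events on the topic. A minimal sketch to exercise the route, assuming Kong's default proxy port (8000) and a hypothetical endpoint path:

python
import requests

# Call the API through the Kong proxy rather than the backend directly;
# the kafka-log plugin emits an event with request/response metadata
# to the configured topic for each proxied request
response = requests.get(
    "http://localhost:8000/api/data",          # proxied to http://example.com
    headers={"Authorization": "Bearer token"}, # hypothetical auth header
)
print(response.status_code)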

Strategy 3: Log-Based Integration

You can also use a log-based approach where application logs are collected and processed to send data to Kafka.

Steps to Implement Log-Based Integration

  1. Set Up Log Forwarder:

    • Use a log forwarder like Filebeat, Fluentd, or Logstash.
  2. Configure Log Forwarder to Send Data to Kafka:

Filebeat Configuration Example:

yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/*.log

output.kafka:
  hosts: ["localhost:9092"]
  topic: "your_topic"
  codec.json:
    pretty: true
  3. Run Log Forwarder:
bash
filebeat -e -c filebeat.yml
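This approach assumes the application writes logs Filebeat can parse. A minimal sketch of an application writing one JSON object per line under the configured path; the file name api.log and the log_api_event helper are hypothetical:

python
import json
import logging

# Write one JSON object per line so Filebeat ships each line as an event;
# the path matches the filebeat.inputs glob above (directory must exist
# and be writable by the application)
handler = logging.FileHandler('/var/log/myapp/api.log')
logger = logging.getLogger('api')
logger.setLevel(logging.INFO)
logger.addHandler(handler)

def log_api_event(endpoint, status, payload):
    logger.info(json.dumps({
        'endpoint': endpoint,
        'status': status,
        'payload': payload,
    }))

# Example usage
log_api_event('/api/data', 200, {'song_id': 'example-123'})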

Summary

These strategies allow you to integrate Kafka with your application API without making direct changes to the API code:

  1. Sidecar Pattern: Run a companion service alongside your application.
  2. API Gateway: Use an API gateway to intercept and forward data.
  3. Log-Based Integration: Use log forwarders to process and send logs to Kafka.

Choose the strategy that best fits your architecture and operational requirements.

Tuesday 6 June 2023

Leadership and Growth Mindset

 https://sites.google.com/deliveroo.com/tech-leadership/engineering-manager-prep/leadership-growth-mindset

Behaviour Interview Prep - Manager

 https://sites.google.com/deliveroo.com/tech-leadership/engineering-manager-prep/behavioural



Resolving Conflicts

Principles of conflict resolution
  • everyone is right in their own eyes
  • avoid false dichotomies (black-or-white thinking)
  • people generally present a position, but rarely the real issue; the person resolving the conflict needs to uncover the real issue by using listening skills
  • don't dig in: that is, don't keep explaining why you think you are right ("I am right because..."), as that leads to escalation; look at the soundness of the arguments instead
  • don't use exaggerated statements like "you always..."
  • seek unity of purpose
  • focus on where we are going
  • it's not the person but the process: ask how we can make sure such conflicts do not appear in future, for example by adopting a RACI framework
  • look for win-wins and mutual benefits
  • long term, develop a culture where people listen to each other

When it comes to resolving conflicts, a few of these communication skills come in handy.

  • Empathy: when you empathise with people, it allows them to move past things; otherwise you will hear them saying the same things in many conversations
  • Acknowledgement: when listening to someone, acknowledgement such as nodding or saying "yes" motivates the speaker; the absence of this kind of encouragement is demotivating
  • Reflecting or rephrasing: listening is more than hearing; we need to understand what the other person is feeling and rephrase it. Reflecting is holding up a mirror to the other person: "this is what I think you said"
  • Brainstorming: allowing people to come up with ideas without judging. Note that an idea might not be good in itself, but it might lead someone else to a better idea along the same lines


Framework for conflict resolution
  • Find agreement
  • Clarify disagreement
  • Express empathy
  • Seek alternatives
  • Agree on criteria for a solution
  • Combine and create
  • Reach consensus

Project Walkthrough for Interview in STAR Format

 Overview of Project 

The company had a roadmap to create a financial data lake. The software engineering team was tasked with creating the infrastructure for this data lake:

  • the actual data lake to store the data
  • an ETL tool which would connect to this data lake and on which jobs would be created
  • tools to query this data lake

The data engineering team, which I was leading, was tasked with migrating data pipelines to the data lake using the tools developed by the software engineering team. Wireframes of how this tool would look were shared with the team, and the data engineering team waited for the software engineering team to complete their work so we could start on the pipeline migration.


STAR Format follows 

Situation (Challenges you faced)

  • 6 months prior to the data lake project completion date, my senior manager discussed the project and asked me to get started. At this time his understanding was still that a tool would be developed.
  • After discussing with the software engineering team, we understood that they had hit roadblocks and the entire effort was delayed; they had not even started the tool development and had not completed the data lake infrastructure.
  • I escalated this to senior management, clarifying that the data engineering project could not start until the ETL tool development was completed.
  • After multiple hard discussions with senior management, the software engineering team accepted that they had completely missed the timeline and there was no time left for data engineering to migrate.
  • The software engineering team would not work on creating the tool; instead they would only work on a data lake API, which the data engineering team would use in their solution.
  • We were to run our development effort in parallel with the data lake API development.
  • Data engineering would work on the migration using the AWS platform and the data lake API.


Tasks (Decisions you took)

  • We were tasked with creating an approach for the migration, with the aim of completing it in the next 6 months.

Action (How you worked with your team and cross-functional peers)

  • Discussed with the team and came up with a few approaches for how we could do the project
  • Had POCs done on multiple approaches within a week's timeframe and presented the pros and cons to management
  • Decided on the approach to follow
  • Led the development effort on the solution proposed by the team; during development, optimized the existing pipeline, which led to an improvement in SLA
  • Had to deal with multiple issues, as the data lake APIs were not fully mature and had bugs with the potential to derail the roadmap
  • Migrated the ETL data pipeline in the next 6 months
  • Created a reusable solution which could be used by multiple teams

Result (Metrics that were impacted)

  • Were able to migrate the pipeline before the anticipated time, allowing ample time for user testing and contributing to the overall success of the data lake
  • Reduced the data delay from 6 hours to 1 hour
  • Showcased to the organization best practices around ETL pipeline development: infrastructure as code, unit test cases for data pipelines, and performance optimizations