Wednesday 19 April 2023

Interview use cases to talk about

 Engineering Challenges solved 

Building a data lake

- A system built up over time was moved to the data lake

- The initial system had multiple Redshift clusters, compaction issues, and multiple S3 paths from which customers consumed data

- Ownership was divided among teams, but not logically

- Access was not controlled, and there was redundant access



STAR format - How was the impact measured 


- SLA improvement -- The team was able to redesign the pipeline during the migration to the data lake and take out redundant steps, improving the SLA from 6 hours to 3 hours

- Cost saving of 200k by moving processing


People Challenges solved 


Hiring and Recruiting Issues solved 


Big Projects Handled 


Appraisal Ratings 


Cost Saving 

- What is the cost of Redshift

- What is the cost of EMR

- What is the cost of Athena

- Number of nodes 


RA3.16xlarge - Reserved instance - 75,000 - 48 vCPU, 384 GB RAM, 128 TB storage, scales up to 16 PB

DS2.8xlarge - 16 TB HDD, 244 GB memory - We had a 50-node cluster - DS2 is deprecated - 30,000

DC2.8xlarge - 32 vCPU, 244 GB memory, 2.5 TB SSD

EMR instance type used - R5d - 48 vCPU, 512 GB memory - Supports up to EMR 6.3
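
For the cost discussion it helps to have the old cluster's aggregate size ready. Below is a rough Python sketch, assuming the per-node DS2 figures noted above are right and that the 50-node count refers to the DS2 cluster. The `NodeSpec` and `cluster_totals` names are made up for illustration and are not part of any AWS library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures, copied from the notes above (not fetched from AWS)."""
    name: str
    memory_gb: int
    storage_tb: int


# DS2 8xlarge figures as written in the notes: 16 TB HDD, 244 GB memory.
DS2_8XLARGE = NodeSpec(name="ds2.8xlarge", memory_gb=244, storage_tb=16)


def cluster_totals(spec: NodeSpec, node_count: int) -> dict:
    """Multiply per-node figures by the node count to get cluster totals."""
    return {
        "node_type": spec.name,
        "nodes": node_count,
        "memory_gb": spec.memory_gb * node_count,
        "storage_tb": spec.storage_tb * node_count,
    }


# The notes mention a 50-node DS2 cluster.
totals = cluster_totals(DS2_8XLARGE, node_count=50)
print(totals)
```

With these inputs the old cluster works out to 800 TB of HDD storage and 12,200 GB of memory, which is a useful baseline when explaining the move to RA3 and the data lake.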
