Harish Kesava Rao

Hello, thank you for visiting my page.

I am Harish, and I have been in the Data space for over a decade. Starting as an entry-level ETL developer in 2008, I progressed steadily in the field as the industry evolved.

In 2010, I quit my job and enrolled in a Master’s degree program in Management Information Systems at the University of Arizona’s Eller College of Management. It is one of the best places to do Master’s coursework focusing on Business Intelligence, Big Data and Machine Learning. It also ranks as the #1 public institution for graduate studies in MIS in the US. Alongside my coursework, I worked part-time as an ETL programmer for the university’s Mining and Geological Engineering Department, helping mining researchers save costs and optimize mining trucks’ routes using data.

After graduating in 2012, I joined Informatica as a Presales Engineer and then moved on to become a Professional Services Engineer. This was in the era of traditional relational databases.

Then in 2017, I moved on to become a Data Engineer at Indeed.com, where I used my data engineering skills to power marketing, finance and product data marts that derive value for job seekers. I was also promoted to Senior Data Engineer. We used Postgres, Python and Airflow, and later moved on to Python, PySpark and Hadoop.

In 2020, I joined Amazon Prime Video as the first Data Engineer for the Prime Video Search division, where I single-handedly designed and delivered a proof of concept for a Data Lake on AWS. Building a petabyte-scale Data Lake on AWS and solving some interesting search prediction challenges was one of the most enriching experiences of my career. The tech stack included PySpark and AWS.

In 2022, I joined Databricks as a Staff Data Infrastructure Engineer, where I designed and implemented low-latency streaming pipelines to ingest data from a variety of APIs and other data sources, using Databricks, AWS, Azure and Terraform (for infrastructure management).

Currently, I am a Principal Data Engineer and Architect, helping Senior Data Engineers design and deliver projects for the Customer Support Services division of Atlassian. I also code every day in Python, SQL and Databricks.

I also contribute to Open Source projects. You can learn more about them, and about me, in my portfolio: https://harishkesavarao.github.io/

news

Jul 1, 2025	[Open Source] Submitted my first PR to Datahub: The Data Discovery Platform for the Modern Data Stack.
Mar 29, 2025	[Talks] Guest lecture to Undergraduate students and faculty of the Kongu Engineering College’s Department of Artificial Intelligence and Data Science. Topic: Building a career in Data
Apr 29, 2024	[Update] Joined Atlassian India as Principal Data Engineer & Data Architect.
Apr 30, 2023	[Open Source] Created the Databricks Partition Sensor (for the Databricks Provider) for Apache Airflow.
Apr 2, 2023	[Open Source] First major contribution to Apache Airflow – Databricks SQL Sensor for Airflow.

latest posts

Mar 1, 2023	Building a data lake on Microsoft Azure.
Jun 1, 2021	Building a data lake on Amazon Web Services.
Nov 23, 2019	Deploying on-premise big data pipelines.