cv
General Information
Full Name | Harish Kesava Rao |
Languages | English |
Education
-
2011 Master of Science
University of Arizona, Tucson, AZ, USA - Major in Management Information Systems.
- Ranked the number 1 public graduate information systems program nationally.
- Courses
- Enterprise Data Management
- Business Intelligence
- Business Communication
- Web Mining and Analytics
- Data Mining
- Software Design Patterns
- Operations Management
-
2007 Bachelor of Technology
Anna University - Major in Information Technology.
Experience
-
2024 - Present Principal Data Engineer
Atlassian - Planning the short-term and long-term technology roadmap for Data Engineering projects.
- Guiding Data Engineers and Lead Data Engineers on design and data architecture for multiple streams.
- Resolving ambiguity and arriving at clear, actionable decisions; helping balance trade-offs between velocity and quality.
- Providing constructive and clear feedback during code reviews and design reviews.
- Helping the team succeed in building robust, scalable, auditable pipelines to create a performant Data Lake on AWS (ECS, S3), Airflow and Databricks.
- Key areas/skills.
- Databricks
- Delta Lake storage
- AWS - S3, SQS, SNS, Kinesis
- Spark - Batch, performance tuning
- dbt, Jinja templates
-
2022 - 2024 Staff Software Engineer/Team Lead, Data Engineering
Databricks - Managing multiple large-scale Data Engineering initiatives. Mentoring and advising Data Engineers.
- Deploying data pipelines and associated resources on AWS, Azure via Terraform (HCL) on Databricks workspaces.
- Creating Spark ingestion notebooks, tuning streaming and batch Spark jobs and clusters on Azure and AWS.
- Ingesting data from REST APIs and storing it on AWS S3 via standard Python frameworks.
- Key areas/skills.
- Databricks
- Delta Lake
- AWS - S3, SQS, SNS, Kinesis, CodeBuild
- Azure - Blob Storage, Event Grid, Event Hubs
- Spark - Streaming, batch, performance tuning
- Terraform - resource management automation for Databricks, AWS and Azure resources.
- Parquet, JSON file management
- Hive metastore
-
2021 - 2022 Senior Data Engineer
Salesforce - Augmented Tableau's license lifecycle analysis with AWS compute and storage alongside Snowflake.
- Key areas/skills.
- AWS - S3, EMR, PySpark.
- Snowflake
- Tableau integration with Python.
-
2020 - 2021 Senior Data Engineer
Amazon Prime Video - First Data Engineer for Prime Video Search.
- Designed and implemented a Data Lake for Prime Video Search using EMR, Spark, Scala, S3, Athena, Tableau, SageMaker.
- Key areas/skills.
- AWS - EMR, S3, SageMaker, Athena, PySpark.
-
2017 - 2020 Senior Data Engineer
Indeed - Designed, standardized and automated data warehouse and data pipelines using Postgres, Hive, Hadoop, Snowflake and Airflow.
- Key areas/skills.
- Python
- Postgres
- PySpark
- Hive
- Docker
- Airflow
-
2014 - 2017 Senior ETL Engineer
Informatica - Developed and deployed ETL pipelines and data warehouses on Oracle, MySQL, MS SQL Server, Netezza and Teradata using Informatica.
- Used Python to implement pipelines to consume raw/unstructured data.
- Key areas/skills.
- Informatica PowerCenter, Data Quality, Metadata Manager, Data Replication, Big Data Edition, Cloud Edition.
- Python
-
2012 - 2013 Presales Technical Consultant
Informatica - Delivered product demos for prospects and ran technical proof-of-concept engagements.
- Key areas/skills.
- Informatica PowerCenter.
Open Source Projects
-
2021 - Present Contributions to Apache Airflow
- Contributions to various providers in Airflow.