Ram Narasimha Reddy - Data Engineer
[email protected]
Location: Dallas, Texas, USA
Relocation: Yes
Visa: OPT EAD
RAMNARSIMHA VARKALA
Data Engineer [email protected] | 816-807-1820 | LinkedIn Professional summary Overall 7 years of hands-on experience as a Data Engineer in Data Warehousing, Data Integration, and Data Modeling, including design, development, coding, testing, bug fixing, and production support of data warehousing applications. Experienced cloud data engineer proficient in Azure and AWS. Specializes in designing scalable data pipelines with tools like Hadoop, AWS Glue, Azure Data Factory and Databricks. Proficient in creating visually appealing user interfaces using HTML, CSS, and popular frameworks. Results-driven Data Engineer with expertise in designing and implementing scalable data ingestion pipelines using Azure Data Factory. Proficient in leveraging Azure Databricks and Spark for distributed data processing and transformation tasks. Skilled in ensuring data quality and integrity through validation, cleansing, and transformation operations. Adept at designing cloud-based data warehouse solutions using Snowflake on Azure, optimizing schemas, tables, and views for efficient data storage and retrieval. Experience with Snowflake Multi - Cluster Warehouses. Extensive experience optimizing data storage and retrieval using AWS services like Amazon DynamoDB, Amazon RDS, and Amazon Aurora. Expertise in data modeling and designing efficient data warehouses using AWS services like Amazon Redshift and Amazon Athena. Designed and developed Tableau dashboards for visualizing complex data sets, providing actionable insights to stakeholders. Developed and maintained complex data pipelines using Apache Airflow for automation. Proficient in using Agile project management tools such as Jira and Trello for tracking and managing tasks, user stories, and team progress. Adept at using Apache Kafka for real-time data streaming and processing. Experience in using Snowflake Clone and Time Travel. In-depth knowledge of Snowflake Database, Schema and Table structures. Build the Logical and Physical data model for snowflake as per the changes required. Collaborative approach in working with data analysts and stakeholders to implement appropriate data models and structures. Strong expertise in optimizing Spark jobs and leveraging Azure Synapse Analytics for big data processing and analytics. Proven track record in performance optimization and capacity planning to ensure scalability and efficiency. Experienced in developing CI/CD frameworks for data pipelines and collaborating with DevOps teams for automated pipeline deployment. Proficient in scripting languages such as Python and Scala. Have a good working experience in Hadoop, HDFS, Map-Reduce, Hive, Python, PySpark. Experience in using Apache Sqoop to import and export data from HDFS and Hive. Integrated Tableau with the existing data infrastructure to enable seamless reporting and analytics on the processed and transformed data from Azure Data Factory, Databricks, Snowflake, and other sources. Adapted Terraform configurations for different cloud providers and ensured compliance with regulatory requirements. Technical skills Azure Services Azure Data Factory, Azure Data Bricks, Snowflake, Logic Apps, Functional App, Snowflake, Azure DevOps AWS Services Glue, S3, EMR, SQS, SNS, Lambda, Athena, Redshift, QuickSight Big Data Technologies MapReduce, Hive, Python, PySpark, Scala, Kafka, Spark streaming, Oozie, Sqoop, Zookeeper Languages SQL, Python, PL/SQL, HiveQL, Scala. 
Technical Skills
Azure Services: Azure Data Factory, Azure Databricks, Snowflake, Logic Apps, Function App, Azure DevOps
AWS Services: Glue, S3, EMR, SQS, SNS, Lambda, Athena, Redshift, QuickSight
Big Data Technologies: MapReduce, Hive, Python, PySpark, Scala, Kafka, Spark Streaming, Oozie, Sqoop, ZooKeeper
Languages: SQL, Python, PL/SQL, HiveQL, Scala
Web Technologies: HTML, CSS, JavaScript, XML, JSP, RESTful, SOAP
Operating Systems: Windows (XP/7/8/10), UNIX, Linux, Ubuntu, CentOS
Version Control: Git, GitHub, Bitbucket
IDE & Build Tools, Design: Eclipse, Visual Studio, Tableau
Databases: MS SQL Server 2016/2014/2012, Azure SQL DB, Azure Synapse, MS Excel, MS Access, Oracle 11g/12c, Cosmos DB

Education
Master's in Computer Science, University of Missouri-Kansas City
Bachelor's in Information Technology, Bharat Institute of Engineering and Technology

Work Experience

Role: Cloud Data Engineer II | May 2024 - Present
Client: McKinsey & Company, Dallas, TX
Responsibilities:
- Designed and implemented scalable data ingestion pipelines using AWS Glue and AWS Lambda, ingesting data from sources such as SQL databases, CSV files, and REST APIs.
- Developed data processing workflows on Amazon EMR, leveraging Spark for distributed data processing and transformation tasks (see the PySpark sketch after this section).
- Ensured data quality and integrity by performing validation, cleansing, and transformation operations in AWS Glue and EMR.
- Designed and implemented a cloud-based data warehouse solution using Snowflake on AWS, leveraging its scalability and performance capabilities.
- Created and optimized Snowflake schemas, tables, and views to support efficient data storage and retrieval for analytics and reporting.
- Collaborated with data analysts and business stakeholders to understand their requirements and implemented appropriate data models and structures in Snowflake.
- Configured event-based triggers and scheduling mechanisms to automate data pipelines and workflows using AWS Step Functions and Amazon EventBridge.
- Implemented data lineage and metadata management solutions to track and monitor data flow and transformations.
- Implemented partitioning, indexing, and caching strategies in Snowflake and AWS services to enhance query performance and reduce processing time.
- Used Tableau to create interactive, insightful dashboards, giving data analysts and business stakeholders a user-friendly way to explore and visualize key metrics and trends.
- Worked with JIRA to report on projects, creating subtasks for development, QA, and partner validation to support efficient project management and cross-team collaboration.
Environment: AWS Glue, AWS Lambda, Amazon EMR, Snowflake, Logic Apps, MS SQL, Oracle, HDFS, MapReduce, YARN, REST API, Pandas, Spark, Hive, SQL, Python, Scala, PySpark, shell scripting, Git, JIRA, Jenkins, Kafka, Airflow, Tableau
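The following is a minimal PySpark sketch of the kind of EMR batch transformation described in this role; the S3 bucket names and columns are hypothetical illustrations, not the actual client pipeline.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_ingest").getOrCreate()

# Hypothetical source: raw CSV exports landed in S3 by upstream systems
raw = spark.read.option("header", True).csv("s3://example-raw-bucket/orders/")

# Basic validation and cleansing: drop rows missing keys, normalize types,
# and de-duplicate on the business key
clean = (
    raw.dropna(subset=["order_id", "order_date"])
    .withColumn("order_date", F.to_date("order_date"))
    .withColumn("amount", F.col("amount").cast("double"))
    .dropDuplicates(["order_id"])
)

# Partitioned Parquet output for downstream consumption (e.g., Snowflake, Athena)
clean.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-curated-bucket/orders/"
)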
Role: Data Engineer | Aug 2022 - Apr 2024
Client: Goldman Sachs, Dallas, TX
Responsibilities:
- Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, Databricks, PySpark, Spark SQL, and U-SQL (Azure Data Lake Analytics).
- Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data from Azure SQL, Blob Storage, and Azure SQL Data Warehouse.
- Worked with Azure Blob and Data Lake storage, loading data into Azure Synapse Analytics (DW).
- Developed data ingestion pipelines on an Azure HDInsight Spark cluster using Azure Data Factory and Spark SQL.
- Designed and implemented a real-time data streaming solution using Azure Event Hubs.
- Conducted performance tuning and optimization to ensure optimal performance of Azure Logic Apps and associated data processing pipelines.
- Served as the OMS team's point person for all data migration tasks.
- Developed a Spark Streaming application to process real-time data from sources such as Kafka and Azure Event Hubs (a minimal sketch appears at the end of this resume).
- Built streaming ETL pipelines with Spark Streaming to extract data from various sources, transform it in real time, and load it into a data warehouse such as Azure Synapse Analytics.
- Used tools such as Azure Databricks or HDInsight to scale out the Spark Streaming cluster as needed.
- Used Jira for bug tracking and Bitbucket to check in and check out code changes.
- Experienced with version control tools such as Git and ticket tracking platforms such as JIRA.
Environment: Azure, Hadoop, HDFS, YARN, MapReduce, Hive, Sqoop, Oozie, Kafka, Spark SQL, Spark Streaming, Eclipse, Informatica, Oracle, CI/CD, PL/SQL, UNIX shell scripting, Cloudera

Role: Data Engineer/Data Analyst | May 2018 - Dec 2021
Client: Amazon, Hyderabad, India
Responsibilities:
- Developed and maintained pipelines for extracting behavior data, transaction history, and customer account data, ensuring efficient and accurate extraction processes.
- Implemented data cleansing and transformation processes to enhance data quality and accuracy, contributing to a 20% reduction in fraudulent activities.
- Collaborated closely with cross-functional teams, including data scientists and product managers, to define and align the real-time fraud detection system with business objectives, driving ongoing improvements in detection accuracy and system efficacy.
- Utilized Snowflake's robust data extraction capabilities to efficiently extract and load large volumes of data into Power BI, enabling timely and accurate reporting.
Environment:

Role: Software Engineer Intern | Nov 2017 - Apr 2018
Client: Vasudhaika Technologies, Hyderabad, India
Responsibilities:
- Designed and implemented key features such as a product catalog, shopping cart, and secure payment processing, resulting in a 20% increase in user engagement.
- Integrated RESTful APIs into the front end, enabling smooth communication between the web application and the server-side components of Kalgudi.
Environment: Python, HTML, CSS, JavaScript, Bootstrap, ReactJS
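Below is a minimal Spark Structured Streaming sketch of the kind of real-time Kafka pipeline described in the Goldman Sachs role above; the broker address, topic name, and output paths are hypothetical, and running it requires the spark-sql-kafka connector package on the cluster.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events_stream").getOrCreate()

# Hypothetical Kafka source; broker address and topic name are placeholders
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka delivers key/value as bytes; cast to strings before transforming
parsed = events.select(
    F.col("key").cast("string").alias("key"),
    F.col("value").cast("string").alias("value"),
    "timestamp",
)

# Write micro-batches to a staging path with checkpointing; a production
# pipeline would load a warehouse such as Azure Synapse from this stage
query = (
    parsed.writeStream.format("parquet")
    .option("path", "/tmp/events_out")
    .option("checkpointLocation", "/tmp/events_chk")
    .start()
)
query.awaitTermination()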