
Sumanth K - Senior Business Intelligence Engineer
[email protected]
Location: New York City, New York, USA
Relocation: Yes
Visa: H1B
Sai Sumanth Kothapalli

Professional Summary
Over 7 years of strong experience in data engineering, data analysis and data mining with large sets of structured and unstructured data, as well as data acquisition, data validation, data modeling, data warehousing and data visualization.
Adept in statistical programming languages such as R and Python, and in Apache Spark. Good knowledge of distributed data processing.
Experienced in data architecture including data ingestion pipeline design, data extraction, data transformation, data modeling, data mining, machine learning and advanced data processing.
Experience in developing complex ETL tasks and pipelines for data updates, data cleansing, data migration and data warehousing.
Experience using SQL to extract data from a variety of operational data sources on multiple platforms and to build data warehouses and data marts that integrate the extracted data for cross-functional analytics and reporting requirements.
Experience in designing data conversion strategies, developing data mappings and designing Extraction, Transformation and Load (ETL) routines for migrating data from non-relational or relational sources to relational targets.
Strong experience in design and development of Business Intelligence solutions using data modeling, dimension modeling, ETL processes, data integration, OLAP and OLTP.
Experienced in Azure Cloud services like Azure Data Lake Store, Azure Data Factory, Azure Synapse Analytics and Microsoft Fabric for deploying data architecture, ETL pipelines and dataflows.
Excellent knowledge of Amazon Web Services (AWS) products like S3, Glue, Athena, EC2, EMR, Aurora, Redshift, RDS.
Experience with Snowflake Multi-cluster Warehouses, Snowflake Virtual Warehouses and building Snowpipe.
Hands-on experience with Azure Data Factory & AWS Glue: monitoring datasets, developing activities, dataflows and ETL pipelines, and performing data transformations in ADF & Glue.
Worked extensively with dimensional modeling, data migration, data cleansing and data profiling for data warehouses.
Experienced in dimensional data modeling, relational data modeling, star schema/snowflake schema modeling, fact & dimension tables, and conceptual, physical & logical data modeling.
Hands-on experience formatting and performing ETL on raw data in various formats such as CSV, JSON, Parquet and XML.
Extensive experience working with Business Intelligence data visualization tools, specializing in Power BI (Power BI Desktop, Power BI Service) and Tableau (Tableau Desktop, Tableau Server).
Strong experience and knowledge in data visualization, creating line and scatter plots, bar charts, histograms, pie & donut charts, box plots, time series, multiple chart types, tables, matrices and drill-down charts.
Strong DAX programming background for creating report measures and columns supporting the reporting functions.
Excellent knowledge of machine learning, statistical modeling and data science life cycle.
Experienced in using Python libraries like numpy, pandas, scikit-learn for data analysis, data profiling, and model building.
Deep understanding of Big Data analytics and algorithms using Hadoop, MapReduce, Spark and distributed computing tools.
Expertise in managing entire data science project life cycle and actively involved in all the phases of project life cycle including data acquisition, data cleaning, data engineering, features scaling, features engineering, statistical modeling (decision trees, regression models, clustering, classification, ensemble models), dimensionality reduction using Principal Component Analysis.
Experienced with MSBI services like Integration Services (SSIS), Reporting Services (SSRS) and Analysis Services (SSAS).
Well-versed in version control and CI-CD tools such as Git, GitLab, SourceTree etc.
Experience in all stages of SDLC (Agile, Waterfall), including writing technical design documents and the development, testing and implementation of enterprise-level data marts and data warehouses.
Expertise in COTS software like Salesforce CRM, Hexagon EAM, SAP EAM and deploying them to migrate processes to Enterprise systems.

Technical Skills:

Business Intelligence: Power BI, SSRS, SSAS, Tableau, SAP Analytics Cloud, Looker, Microsoft Fabric
Databases/Data Warehousing: SQL Server, MySQL, Oracle 11g, Azure SQL Database, AWS Aurora, Snowflake, Azure Synapse Analytics, AWS Redshift
Cloud Environment: Amazon Web Services (AWS), Microsoft Azure, GCP, Oracle Cloud
Big Data Ecosystem: HDFS, MapReduce, Spark, Scala, Kafka, AWS (EC2, S3, EMR)
Other Tools: HxGN EAM, Power Pivot, Git, Jira, Confluence, CI-CD, Jupyter Notebooks, SAP BusinessObjects, Power Query, SharePoint, Power Apps, Power Automate
Programming Languages: C, SQL, MySQL, PL/SQL, T-SQL, R, Python, JavaScript
ETL Tools: Alteryx, Talend Studio, Apache NiFi, AWS Glue, Azure Data Factory (ADF), Informatica





Professional Experience:

Volantsoft Inc, New York, NY
Client: MTA
Senior Data Engineer/BI Developer, Jan 2021 - Present

Responsibilities:
Created & analyzed business requirements to compose functional and implementable technical data solutions & architecture.
Collaborated with cross-functional teams to define project requirements and deliver data-driven solutions.
Migrated data from several on-premises source systems to cloud data storage and cloud data warehouse with low downtime.
Developed and monitored ETL pipelines in Azure Data Factory (ADF) for data ingestion from various sources into Azure storage and then to Azure Synapse Warehouse.
Migrated data from legacy data systems into HxGN EAM and then to Azure warehouse for BI and analytics. Performed Incremental Loading from EAM Oracle database to Azure Warehouse to keep the data up to date and relevant.
Followed industry standards for partitioning data in the data warehouse into buckets, using techniques like value-based and range-based partitioning.
Wrote advanced SQL queries to extract data from relational databases and created views, summary tables and data aggregations.
Wrote T-SQL (DDL and DML) queries, stored procedures and triggers to handle data tuples based on functional requirements.
Developed activities and data flows in ADF to perform transformations like data aggregations, data cleansing, data deduplication, data derivation (creating new columns), data filtering, data integration, data joining and data splitting.
Implemented incremental loading in ETL jobs using CDC to optimize compute requirements for data loading and refreshing.
Implemented data validation techniques for ETL jobs that moved data from source systems to target data warehouse which improved data quality, data integrity and data richness to be used in reports and analytics for user consumption.
Developed data marts from warehouse for various cross-functional teams to help them with operations and analytics.
Integrated several third-party systems with Azure cloud for improving scalability and performance of the data reads & writes.
Gathered data from various external sources (Oracle, SharePoint, Salesforce, Azure Data Lake, etc.) into Power BI service using various connectors and developed dataflows for consumption in Power BI reports and dashboards.
Utilized Power Query editor in Power BI to fetch and transform data. Developed Power BI reports and semantic models from OneLake workspaces using Microsoft Fabric.
Utilized Microsoft Copilot to summarize Power BI reports and semantic models and to create DAX queries & measures.
Developed various solution driven views and dashboards by developing different visuals including Pie Charts, Bar Charts, Tree Maps, Line Charts, Tables, Matrices in Microsoft Power BI.
Utilized DAX extensively to create new calculated measures, calculated columns, conditional columns and transformed data to be utilized in Power BI dashboards and reports.
Deployed Power BI Apps on Power BI Service. Deployed and maintained Workspaces architecture in Power BI Service. Created several user roles and groups for end users and applied Row-Level Security (RLS) to them.
Developed several Bookmarks and Drill-throughs and enabled seamless application interactions in Power BI. Worked with different levels of filters, including report-level, page-level and visual-level filters.
Developed several Dataflows in Power BI service using various connectors to create semantic models for dashboards.
Configured incremental refresh, scheduled refresh for the datasets, dataflows & semantic models in Power BI service natively and automated through Power Automate flows.
Developed Power BI deployment pipelines to move the reports among Testing and Production workspaces.
Troubleshot and supported reports in Power BI workspaces whenever issues arose in Production.
Developed CI-CD pipeline to automate build and deploy to Dev, QA, and production environments.
Developed Power Automate flows to integrate SharePoint, Power BI and Power Apps and deployed automated solutions and maintained the ecosystem of applications.
Worked on technical documentation of ETL process and design documents for each module.
Created high- and low-level design document specifications for source-target mapping, based on transformation rules. Developed data dictionaries and data mapping tables, documentation for source to target tables.
Led custom development Sprints in Agile/Scrum framework as the point of contact between the business team and the in-house development, configuration, deployment, and QA teams.
Managed and collaborated with data analysts and delivered several projects which involved design, development, testing, deployment, and sustainment phases.
Environment: Microsoft Power BI, Microsoft Fabric, Copilot, Tableau, Excel, SQL, PySpark, HxGN EAM, Snowflake, Python, Oracle, SQL Server, Azure Data Factory, ETL, T-SQL, JSON, XML, Git, Jira, Confluence, SharePoint, Power Automate, Power Apps.



PDX, Fort Worth, TX
Sr. Data Analytics Engineer, Jan 2019 - Dec 2020

Responsibilities:
Involved in requirements gathering, analysis, design, development, change management, deployment.
Performed data integration and Extraction, Transformation, and Load (ETL) processes; in the preprocessing phase, used Pandas to remove or replace missing data.
Processed large amounts of data through a complex grouping and data manipulation process to produce benchmarks used in decision support system.
Worked on ER Modeling, Dimensional Modeling (Star Schema, Snowflake Schema), Data warehousing and OLAP tools.
Developed Spark jobs in PySpark to perform ETL from SQL Server to Hadoop.
Wrote SQL queries to extract data from one system and migrate to another system which involved ETL of the data.
Developed SQL procedures and triggers to insert/update data tuples on various tables in the database which served as the source for several Power BI dashboards and reports.
Wrote DAX queries and configured Power Automate flows to send automated reports/emails to business user groups for daily, weekly and monthly view of the operational data.
Developed several Dataflows in Power BI service using connectors to be utilized in Power BI semantic models for dashboards.
Configured incremental refresh, scheduled refresh for the datasets, dataflows & semantic models in Power BI service natively and automated through Power Automate flows.
Translated recommendations into communication materials to effectively present to colleagues and mid-to-upper-level management.
Utilized Pivot Tables, Index-Matching and VLOOKUP in Excel to analyze data and create reports.
Provided hands-on development support, assisting users in creating and modifying worksheets and data visualization dashboards. Created and implemented interactive charts, graphs, and other user interface elements using Power BI.
Developed storytelling dashboards in Power BI Desktop and published them to Power BI Service, allowing end users to understand the data on the fly using bookmarks and quick filters for on-demand information.
Developed CI-CD pipeline to automate build and deploy to Dev, QA, and production environments.
Working knowledge of build automation and CI/CD pipelines.
Developed Python scripts to automate data ingestion pipelines for multiple data sources into Azure Data Lake Storage (ADLS).

Environment: SQL, Python, Hadoop, Spark, Power BI, Excel, Snowflake, PySpark, SQL Server 2012, T-SQL, CI-CD, Git, XML, Jira.

Education:
Master of Science in Computer Science, August 2020
University of Minnesota, Duluth
Bachelor of Technology in Computer Science, May 2018
Jawaharlal Nehru Technological University, India
