
Sai Lakshmi - Data Analyst
Email: [email protected]
Ph. No: +1 (201) 609-8006
Location: Bordentown, New Jersey, USA
Visa: GC
PROFESSIONAL SUMMARY:
10+ years of experience as a Data Analyst, with functional and industry experience spanning process and project responsibilities such as data analysis, design, development, user acceptance, and performance management.
Strong professional experience with emphasis on analysis, design, development, testing, maintenance, and implementation of data mapping, data validation, and requirements gathering in data warehousing environments.
Experience in data warehousing applications using ETL tools such as SSIS and programming languages including Python, R, Java, Scala, Matlab, and SQL/PL-SQL against Oracle and SQL Server databases.
Experience handling very large datasets on cloud platforms and clusters such as Amazon Web Services (AWS), MS Azure, Amazon Redshift, and Hadoop, including data archiving.
Performed data analysis and data profiling using complex SQL on various source systems including Oracle and Teradata.
Experience providing custom solutions such as eligibility criteria, match, and basic contribution calculations for major clients using Informatica, and building reports with Power BI, Tableau, Looker, and QlikView.
Extensively used Python libraries including PySpark, pytest, PyMongo, cx_Oracle, pyexcel, Boto3, psycopg, embedPy, NumPy, and Beautiful Soup.
Experience in Data Analysis, Data Profiling, Data Migration, Data Integration and validation of data across all the
integration points.
Familiar with ETL (Extract, Transform, Load) processes and pipeline tools such as Apache NiFi, Apache Airflow, and dbt (Data Build Tool).
Built predictive models using Google AutoML, H2O.ai, and TPOT, increasing revenue and improving model accuracy.
Familiar with GDPR and CCPA regulations and with maintaining data privacy across analytics workflows.
Extensive experience using ETL and reporting tools such as SQL Server Integration Services (SSIS) and SQL Server Reporting Services (SSRS).
Experienced in Big Data technologies including Apache Hadoop and Apache Spark, with expertise in data extraction and exploratory analysis.
Implemented Scikit-learn, PyTorch, Keras, and TensorFlow for machine learning, developing predictive models to
forecast waste generation patterns and optimize resource allocation.
Designed and developed weekly and monthly reports using spreadsheet techniques (charts, graphs, pivot tables) in MS Excel, Quip, Zoho Sheet, and WPS Spreadsheets, along with PowerPoint presentations.
Experience with Data analysis and SQL querying with database integrations and data warehousing for large financial
organizations.
Strong working experience in data cleaning, data warehousing, and data massaging using Python libraries and MySQL (a brief pandas sketch of this cleaning pattern follows the summary).
Experience creating ad-hoc reports and data-driven subscription reports using SQL.
Expertise in Power BI, Power BI Pro, Power BI Mobile. Expert in creating and developing Power BI Dashboards.
Experienced in RDBMS such as Oracle, MySQL and IBM DB2 databases.
Hands-on experience writing and optimizing complex queries in relational databases including Oracle, Teradata, and SQL Server (T-SQL), supplemented with Python.
Experienced in business requirements collection using Agile, Scrum, and Waterfall methodologies, and in software development life cycle (SDLC) testing methodologies, disciplines, tasks, resources, and scheduling.
Extensive knowledge of data profiling using the Informatica Developer 9.x/8.6.0/8.1.1/7.x/6.x tool.
Comfortable with version control systems such as Git for collaborating on data-related projects, especially in team settings.
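The data cleaning work summarized above follows a common pandas pattern; the sketch below is a minimal illustration, with a hypothetical file and column names (customer_id, signup_date, revenue) rather than any client's actual schema.

    import pandas as pd

    def clean_customers(path: str) -> pd.DataFrame:
        # Load the raw extract; the file and columns are hypothetical.
        df = pd.read_csv(path)
        # Drop exact duplicates and rows missing the key column.
        df = df.drop_duplicates().dropna(subset=["customer_id"])
        # Normalize the key: trim whitespace left by manual entry.
        df["customer_id"] = df["customer_id"].astype(str).str.strip()
        # Parse dates, coercing unparseable values to NaT for later review.
        df["signup_date"] = pd.to_datetime(df["signup_date"], errors="coerce")
        # Coerce revenue to numeric; treat gaps as zero for reporting.
        df["revenue"] = pd.to_numeric(df["revenue"], errors="coerce").fillna(0.0)
        return df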
TECHNICAL SKILLS:
Languages: SAS, SQL, Python, R, Java, Scala, Matlab
BI Tools: Tableau, Microsoft Power BI, PowerPivot
Data Warehousing Tools: Talend, SSIS, SSAS, SSRS, Toad Data Modeller
Python Libraries: Scikit Learn, Pandas, Numpy, Scipy, Matplotlib, Seaborn, Plotly
Data Visualization: Tableau, Microsoft Power BI
ETL: Informatica PowerCenter, SSIS
Machine Learning Models: scikit-learn, TensorFlow
Microsoft Tools: Microsoft Office, MS Project
Database Tools: SQL Server, MySQL, MS Excel, PostgreSQL, SQLite, MongoDB
Data Analysis: Web Scraping, Statistical Modelling, Hypothesis Testing, Predictive Modelling
Data Mining Algorithms: Decision Trees, Clustering, Random Forest, Regression

CERTIFICATIONS:
Google Data Analytics Professional Certificate-Coursera
Microsoft Certified: Power BI Data Analyst Associate
PROFESSIONAL EXPERIENCE:
Client: Southwest Airlines, Dallas, TX April 2023 - Present
Role: Data Analyst
Responsibilities:
Implemented and followed Agile development methodology within the cross-functional team and acted as a liaison
between the business user group and the technical team.
Analyzing and validating large datasets to support ad-hoc analysis, reporting, and remediation using SAS.
Process data from platforms such as Snowflake, writing complex SQL and SAS queries using joins, subqueries, table creation, and aggregation, and applying DDL, DQL, and DML concepts.
Perform ETL (Extract, Transform, Load) using tools such as Informatica and Azure to integrate and transform disparate data sources.
Perform data scraping, data cleaning, data analysis, and data interpretation and generate meaningful reports using Python
libraries like pandas, matplotlib, etc.
Adept at leveraging cloud platforms such as AWS, Azure, and Google Cloud to support scalable data storage and analytics
solutions.
Generate weekly reports with visualizations using tools such as MS Excel (pivot tables and macros) and Tableau to enable business decision-making.
Performed data analysis and statistical analysis and generated reports, listings, and graphs using SAS tools: SAS/Base, SAS/Macro, SAS/Graph, SAS/SQL, SAS/Connect, and SAS/Access.
Generated various dashboards and created calculated fields in Tableau for data intelligence and analysis based on business requirements.
Played a pivotal role in the selection and utilization of ML frameworks including PyTorch, TensorFlow, and scikit-learn, aligning technology choices with project requirements.
Applied advanced analytical techniques to solve business problems that are typically medium to large scale with impact to
current and/or future business strategy.
Applied innovative and scientific/quantitative analytical approaches to draw conclusions and make 'insight to action'
recommendations to answer the business objective and drive the appropriate change.
Translated recommendations into communication materials to present effectively to colleagues for peer review and to mid-to-upper-level management.
Used Spark SQL and PySpark for large-scale data analysis (a brief PySpark sketch follows this section).
Incorporated visualization techniques to support the relevant points of the analysis and ease the understanding for less
technical audiences.
Used Power BI Desktop to develop data analyses over multiple data sources and visualize the results in reports.
Identified and gathered the relevant and quality data sources required to fully answer and address the problem for the
recommended strategy through testing or exploratory data analysis (EDA).
Transformed disparate data sources and determined the appropriate data hygiene techniques to apply.
Understood and adopted emerging technologies affecting the application of scientific methodologies and quantitative analytical approaches to problem resolution.
Delivered analyses and findings in a manner that conveyed understanding, influenced mid-to-upper-level management, garnered support for recommendations, drove business decisions, and influenced business strategy.
Environment: SAS, Dremio, Snowflake, Tableau, Pandas, NumPy, seaborn, SciPy, Matplotlib, Power BI, T-SQL, MS SQL Server, MS Excel, UML, MS Visio, ETL - SSIS, SSRS, Data Modelling - Star Schema, Snowflake Schema.
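As a rough illustration of the PySpark work noted above, the sketch below aggregates delay metrics per route; the Parquet path and column names are hypothetical placeholders, not the client's actual data model.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("route-delay-summary").getOrCreate()

    # Read a hypothetical flights dataset and summarize delays per route.
    flights = spark.read.parquet("s3://example-bucket/flights/")
    summary = (
        flights.groupBy("origin", "destination")
        .agg(
            F.count("*").alias("num_flights"),
            F.avg("arrival_delay_minutes").alias("avg_delay"),
        )
        .orderBy(F.desc("avg_delay"))
    )
    summary.show(10)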
Client: JME Insurance, Dallas, TX Nov 2021 - Mar 2023
Role: Data Analyst
Responsibilities:
Collaborated with stakeholders and cross-functional teams to elicit, elaborate, and capture functional and non-functional requirements.
Translated business requirements into technical requirements and performed data modeling accordingly.
Performed analytical modeling, database design, data analysis, regression analysis, data integrity, and business analytics.
Created Data Mapping between the source and the target system. Created documentation to map source and target tables
columns and datatypes.

Skilled in machine learning algorithms for anomaly detection and predictive analytics, leveraging frameworks like PyOD and XGBoost.
Experienced in conducting geographic analysis using geospatial analysis frameworks like GeoPandas and Folium.
Built text analytics, generated data visualizations using R and Python, and created dashboards using tools like Tableau and Power BI. Strong experience migrating other databases to Snowflake.
Applied basic DML and DDL skills such as writing subqueries, window functions, and CTEs.
Wrote SQL queries using insert, update, and delete statements and exported data as CSV, XML, TXT, etc.
Also wrote SQL Server queries that included inner, outer, left, right, and self joins.
Summarized sample data using statistics such as the mean and standard deviation, and performed linear regression.
Compared different WFHM DB environments and determined, resolved, and documented discrepancies.
Involved in ETL development, creating required mappings for the data flow using SSIS.
Generated various graphical capacity-planning reports using Python packages such as NumPy, matplotlib, and SciPy.
Designed ETL flows to implement using Informatica Power Center as per the mapping sheets provided.
Optimized data sources to improve report runtime for the route-distribution analytics dashboard in Power BI.
Developed UML use-case diagrams, class diagrams, and other diagrams using MS Visio.
Worked on predictive analytics use-cases using Python language.
Conducted A/B testing and statistical analysis using scipy.stats and statsmodels to evaluate the effectiveness of marketing campaigns and product features (a brief Python sketch follows this section).
Developed serverless data processing pipelines using AWS Lambda functions, orchestrating data workflows and integrating with API Gateway for event-driven processing.
Managed departmental reporting systems, troubleshooting daily issues and integrating existing Access databases with numerous external data sources (SQL, Excel, and Access).
Utilized Power BI and custom SQL features to create dashboards and identify correlations.
Prepared scripts in R and shell for automating administration tasks.
Developed and implemented data governance frameworks and policies to ensure data quality and compliance with
regulatory requirements such as GDPR and CCPA.
Wrote several Teradata SQL queries using SQL Assistant for ad-hoc data pull requests.
Extracted source data from Oracle tables, MS SQL Server, sequential files, and Excel sheets.
Created Data Quality Scripts using SQL to validate successful data load and the quality of the data. Created various types of
data visualizations using R and PowerBI.
Performed data analysis and data profiling using complex SQL on various source systems.
Categorized data and generated reports on multiple parameters using MS Excel and Power BI.
Worked on logical and physical modeling of various data marts as well as DW/BI architecture using Teradata.
Environment: Matplotlib, Power BI, T-SQL, MS SQL Server, MS Excel, UML, MS Visio, ETL - SSIS, SSRS, Data Modelling - Star Schema, Snowflake Schema, Shell Script, Jira, Git, Teradata, Workflows.
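A minimal sketch of the A/B-testing approach mentioned above, using scipy.stats on synthetic conversion data; the rates and sample sizes are placeholders, not actual campaign results.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)

    # Hypothetical per-user conversion outcomes (1 = converted).
    control = rng.binomial(1, 0.10, size=5000)
    variant = rng.binomial(1, 0.12, size=5000)

    # Two-sample t-test on the conversion outcomes; at this sample
    # size it closely approximates a z-test on proportions.
    result = stats.ttest_ind(variant, control)
    print(f"control rate: {control.mean():.3f}")
    print(f"variant rate: {variant.mean():.3f}")
    print(f"p-value:      {result.pvalue:.4f}")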
Client: Zensar Technologies, India Oct 2018 - Dec 2020
Role: Data Analyst
Responsibilities:
Involved in requirements gathering, analysis, design, development, testing, and production of applications using the SDLC Agile/Scrum model.
Worked on the entire Data Analysis project life cycle and actively involved in all the phases including data cleaning, data
extraction and data visualization with large data sets of structured and unstructured data, created ER diagrams and
schema.
Advanced Python knowledge, especially of libraries such as NumPy, pandas, scikit-learn, TensorFlow, PyTorch, and PyCaret.
Manipulated benefits-related data for lab work, clinic services, etc. by writing SQL Server queries that included inner, outer, left, right, and self joins, and exported the data as CSV, TXT, XML, etc.
Summarized sample data using statistics such as the mean and standard deviation, and performed linear regression.
Worked on Data Warehousing principles like Fact Tables, Dimensional Tables, Dimensional Data Modelling - Star Schema
and SnowFlake Schema.
Expertise in data manipulation, statistical analysis, and visualization using tools like dplyr, ggplot2, and Shiny.
Created ad-hoc reports for users in Tableau by connecting various data sources.
Used Excel sheets, flat files, and CSV files to generate Tableau ad-hoc reports.
Involved in defining source-to-target data mappings, business rules, and business and data definitions.
Worked closely with stakeholders and subject matter experts to elicit and gather business data requirements.
Used pandas, NumPy, seaborn, SciPy, and Matplotlib in Python to develop machine-learning workflows, applying algorithms such as linear and multivariate regression for data analysis (a brief scikit-learn sketch follows this section).
Worked with business analyst groups to ascertain their database reporting needs.
Created a MongoDB database and wrote several queries to extract data from it.
Wrote scripts in Python to extract data from HTML files.

Connected PostgreSQL databases to Python.
Tracked velocity, capacity, burn-down charts, and other metrics during iterations; created data flow diagrams.
Automated a process in R to extract data and various document types from a website, save the documents to a specified file path, and upload them into an Excel template. Performed data analysis and data profiling using SQL on various source systems including Oracle and Teradata.
Utilized the SSIS ETL toolset to analyze legacy data for data profiling.
Utilized Power BI reporting to create, test, and validate various visualizations, ad-hoc reports, dashboards, and KPIs.
Designed and published visually rich and intuitively interactive PowerBI/Excel workbooks and dashboards for executive
decision making.
Orchestrated big data processing workflows using EMR clusters, leveraging frameworks such as Apache Spark and Hadoop for distributed data processing.
Developed and streamlined a CRM database and built SQL queries for data analysis of 1 million+ records.
Generated new market and investment banking reports using SSRS, increasing efficiency by 50%.
Introduced Power BI, designed dashboards for time-based data, and improved performance by 40%.
Built ETL workflows for automated reporting of investment banking data using SSIS, reducing the workload by 40%.
Environment: Tableau, SQL Server, NumPy, seaborn, SciPy, Matplotlib, Python, SDLC (requirements gathering, analysis, design, development, testing), Agile/Scrum, Data Warehouse, MongoDB, PostgreSQL, Oracle, Teradata, Informatica (Informatica Data Explorer, Informatica Data Quality), ETL, Data Modelling - Star Schema, Snowflake Schema, KPI.
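The regression work described above typically follows the scikit-learn fit/predict pattern sketched below; the synthetic features stand in for real project data.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import r2_score
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for real features: y depends linearly on two of them.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

    # Hold out a test split and evaluate fit quality out of sample.
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = LinearRegression().fit(X_train, y_train)
    print("R^2 on held-out data:", r2_score(y_test, model.predict(X_test)))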
Company: HSBC, India Nov 2016 - Sep 2018
Role: Data Analyst
Responsibilities:
Generated energy consumption reports using SSRS, which showed the trend over day, month and year.
Performed ad-hoc analysis and data extraction to resolve 20% of the critical business issues.
Well versed in the Agile Scrum development methodology, applied in day-to-day work on Building Automation Systems (BAS) development.
Produced weekly, monthly, and quarterly insight reports on pricing trends and opportunities using Excel, Tableau, and SQL databases.
Streamlined and automated Excel/Tableau dashboards for improved speed through Python- and SQL-based solutions.
Familiar with cloud-native analytics tools such as AWS QuickSight, Azure Synapse Analytics, and Google Data Studio.
Designed creative dashboards, storylines for dataset of a fashion store by using Tableau features.
Developed SSIS packages for extract/load/transformation of source data into a DW/BI architecture/OLAP cubes as per the
functional/technical design and conforming to the data mapping/transformation rules.
Developed data cleaning strategies in Excel (multilayer fuzzy match) and SQL (automated typo detection and correction) to organize alternative datasets daily and produce consistent, high-quality reports (a brief Python fuzzy-match sketch follows this section).
Created views to facilitate easy user interface implementation, and triggers on them to facilitate consistent data entry into
the database.
Involved in Data Analysis and Data Validation by extracting records from multiple databases using SQL in Oracle SQL
Developer tool.
Familiar with SQL-based query engines for big data platforms such as Apache Hive, Impala, and Presto.
Identified data sources and defined them to build data source views.
Involved in designing the ETL specification documents like the Mapping document (source to target).
Used ETL (SSIS) to develop jobs for extracting, cleaning, transforming and loading data into data warehouse.
Created Stored Procedures and executed the stored procedure manually before calling it in the SSIS package creation
process.
Wrote SQL test scripts to validate data for different test cases and test scenarios.
Created SSIS Packages to export and import data from CSV files, Text files and Excel Spreadsheets.
Performed data manipulation - inserting, updating, and deleting data from data sets
Developed various stored procedures for the data retrieval from the database and generating different types of reports
using SQL reporting services (SSRS).
Environment: Windows, SDLC-Agile/Scrum, SQL Server, SSIS, SSAS, SSRS, ETL, PL/SQL, Tableau, Excel, CSV Files, Text Files,
OLAP, Data Warehouse, SQL - join, inner join, outer join, and self-joins.
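A Python stand-in for the fuzzy-match cleanup described above, using only the standard library; the canonical vendor list and cutoff are hypothetical examples, not the production Excel/SQL implementation.

    from difflib import get_close_matches

    # Hypothetical canonical list; the real cleanup mapped many more values.
    CANONICAL_VENDORS = ["Siemens", "Honeywell", "Johnson Controls", "Schneider Electric"]

    def correct_vendor(raw: str, cutoff: float = 0.8) -> str:
        """Map a possibly misspelled vendor name to its canonical form."""
        match = get_close_matches(raw.strip(), CANONICAL_VENDORS, n=1, cutoff=cutoff)
        return match[0] if match else raw  # unmatched values go to manual review

    print(correct_vendor("Seimens"))          # -> Siemens
    print(correct_vendor("Jonson Controls"))  # -> Johnson Controls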
Client: Sagar soft Pvt Limited, India Feb 2013 - Oct 2016
Role: Data Analyst
Responsibilities:

Evaluated new applications and identified system requirements.
Visualized KPI metrics like resource utilization, net profit margin, gross profit margin and burn rate using Tableau.
Worked on time-series analysis using pandas to identify patterns in how asset variables change, which in turn helped drive project completion by 70% (a brief pandas sketch follows this section).
Conducted data extraction, transformation, and loading (ETL) processes using tools like Apache NiFi and Talend to
ingest healthcare data from disparate sources.
Designed and implemented data models using tools like Erwin and SQL Server Management Studio to ensure efficient
storage and retrieval of banking data.
Recommended solutions to increase revenue, reduce expenses, and maximize operational efficiency, quality, and compliance.
Identified business requirements and analytical needs from potential data sources.
Performed SQL validation to verify data extract integrity and record counts in the database tables.
Worked with ETL developers on testing and data mapping, staying aware of data models to translate and migrate data.
Created Requirements Traceability Matrix (RTMs) using Rational Requisite Pro to ensure complete requirements coverage
with reference to low level design document and test cases.
Assisted the Project Manager to develop both high-level and detailed application architecture to meet user requests and
business needs. Also, assisted on project expectations and in evaluating the impact of changes on the project plans
accordingly and conducted project related presentations and in performing Risk Assessment, Management and Mitigation.
Collaborated with different teams to analyze, investigate and diagnose root cause of problems and publish root cause
analysis report (RCA).
Skilled in using advanced SQL queries and analytic functions for date calculations, cumulative distributions, and NTILE calculations.
Used advanced Excel formulas and functions such as pivot tables, LOOKUP, IF with AND, and INDEX/MATCH for data cleaning.
Environment: SQL, ETL, data mapping, Tableau, NTILE, RCA, RTMs, Pivot Tables, KPI metrics.
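The pandas time-series pattern referenced above is sketched below on synthetic daily readings; the resample and rolling windows are illustrative choices, not the project's actual parameters.

    import numpy as np
    import pandas as pd

    # Synthetic daily asset readings standing in for real project data.
    rng = np.random.default_rng(1)
    idx = pd.date_range("2015-01-01", periods=365, freq="D")
    usage = pd.Series(100 + rng.normal(0, 5, size=365).cumsum(), index=idx)

    # Resample to month-start means and smooth with a 30-day rolling
    # average to separate the trend from day-to-day noise.
    monthly = usage.resample("MS").mean()
    rolling = usage.rolling(window=30, min_periods=1).mean()
    print(monthly.head())
    print(rolling.tail())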
