KARTHEEK D
Senior Azure Data Engineer
Phone: (901) 290-3522
Email: [email protected]
LinkedIn: https://www.linkedin.com/in/kartheek-d/
Location: Plano, Texas, USA
Relocation: Open to relocating to client locations
Visa: Green Card
OBJECTIVE:
Accomplished Azure data engineer with over 10 years of experience seeking to leverage expertise in end-to-end Azure
analytics implementations, from conceptual design to production operations. Core skills encompass building big data pipelines,
ETL workflows with WhereScape RED, Azure Data Factory, and Informatica, and data warehouses leveraging services such as Azure
Databricks, Synapse Analytics, and Data Lake Storage. Played significant roles in Azure cloud solutions, handling data
ingestion, optimizing Spark workloads, enforcing data governance standards, and leading documentation efforts. Passionate
about joining a dynamic team utilizing advanced Azure platform capabilities to power data-driven business growth.
PROFESSIONAL SUMMARY:
Experienced Data Professional with 10+ years in IT, specializing in Microsoft Azure services like Azure Data Factory (ADF),
Databricks, Cosmos DB, and Azure Data Lake Storage Gen 2 (ADLS Gen2).
Proficient in big data technologies for ETL, including MapReduce, Hive, YARN, Apache Spark, and HDFS.
Accomplished data expert with a comprehensive background in managing the entire ETL (Extract, Transform, Load) data flow
process, ensuring flexibility and seamless performance.
Utilized Azure Blob Storage, Azure SQL Database, and Azure Cosmos DB in ETL workflows for smooth data integration across
Azure data services.
Experience in transitioning SQL databases to Azure Data Lake Storage (ADLS Gen 2), utilizing Azure Data Lake Analytics,
Azure SQL Database, Azure Databricks, and Azure Synapse Analytics.
Configured and managed Synapse SQL pools for scalable data storage and high-performance analytics.
Improved data security and compliance by leveraging Azure Databricks embedded authentication and authorization features.
Skilled in managing database access control and migrating on-premises databases to Azure Data Lake Storage (ADLS Gen2)
using Azure Data Factory (ADF) for improved data integration and analytics.
Developed efficient ETL transformations and robust validation procedures utilizing Spark SQL and DataFrames in Azure
Databricks and Azure Data Factory (ADF) environments.
Designed and deployed scalable microservices architecture using Azure Service Fabric for high availability and fault tolerance.
Successfully integrated on-premises and cloud data platforms using Azure Data Factory (ADF), executing intricate transformations,
and enhancing data loading efficiency into Snowflake.
Proficient in Snowflake's architecture, including its shared-nothing compute design and micro-partitioning, which facilitate
scalable and efficient data storage and retrieval.
Proficient in utilizing Cosmos DB in Azure, adept at designing, implementing, and managing globally distributed NoSQL databases
for scalable, high-performance applications.
Proficient in establishing CI/CD pipelines using Bitbucket and Azure DevOps, automating build, test, and deployment procedures
to enhance software delivery efficiency, scalability, and reliability.
Actively collaborate with Azure Logic Apps administrators and DevOps engineers to swiftly monitor and resolve automation and
data pipeline issues, fostering seamless teamwork.
Proficient in integrating SSIS with Azure Data Factory (ADF) for cloud-based data workflow orchestration, utilizing Azure's
scalability and advanced integration services.
Implemented and launched an advanced data catalog, greatly enhancing data discoverability and governance across the
organization, thereby transforming data management and accessibility.
Proficient in crafting, executing, and overseeing intricate data models in Salesforce, with adeptness in Salesforce Object Query
Language (SOQL) and Salesforce Object Search Language (SOSL).
Proficient in coordinating and managing data workflows using Control-M for task scheduling and automation, focusing on Azure
cloud integration.
Proficient in designing, developing, and maintaining data integration solutions for both Hadoop and RDBMS environments.
Proficient in constructing complete data pipelines, utilizing HDFS as the core storage, seamlessly integrating with Apache Kafka
for data ingestion and Apache Spark for processing and analysis of varied datasets.
Designed scalable HDFS architectures to optimize storage and processing for multi-petabyte datasets, achieving data
redundancy and high availability within distributed computing environments.
Experienced in crafting and managing ETL/ELT workflows utilizing Apache Spark, Apache Beam, and Apache Airflow to
streamline data extraction, transformation, and loading operations.
Proficient in utilizing ETL tools such as Apache Spark, Informatica, or custom scripts to load processed data into destination
systems.
Skilled in data modeling and schema design for consistent and reliable data in Hive and Spark.
Utilized T-SQL scripting to automate routine database tasks, thereby boosting productivity and minimizing manual effort in database
administration and maintenance.
Developed and enhanced SQL queries incorporating PARTITION BY and ORDER BY clauses to achieve precise data partitioning
and ordering within window functions.
Proficient in crafting and optimizing SQL queries encompassing Data Definition Language (DDL) and Data Manipulation
Language (DML) for efficient data management and retrieval.
Skilled in utilizing PySpark for extensive data processing and analytics, showcasing adeptness in API utilization and optimization
methods for streamlined data transformations and computations.
Demonstrated adeptness in Python and Scala scripting for optimized data processing, fostering streamlined workflows and
informed decision-making in the organization.
Proficient in essential Python libraries for data engineering, including Pandas for data manipulation, NumPy for numerical
computation, and PySpark for large-scale data processing.
Proficient in utilizing GIT for version control, ensuring efficient management of source code for multiple concurrent projects, and
enhancing team collaboration and code quality.
Experienced in utilizing JIRA for project reporting, task management, and ensuring efficient project execution within Agile
methodologies.
TECHNICAL SKILLS:
Cloud Services: Azure Data Factory, Azure Databricks, Snowflake, Logic Apps, Function Apps,
Azure DevOps, Azure Event Hub, Azure Synapse Analytics, Azure Data Lake,
PolyBase, Azure SQL Server.
Hadoop Distribution: Cloudera, Hortonworks.
Big Data Technologies: HDFS, MapReduce, Hive, HBase, YARN, Scala, Kafka, Spark Streaming, Oozie,
Sqoop, Zookeeper, Pig, Flume.
Languages: SQL, PL/SQL, Python, HiveQL, Scala, PySpark.
Visualization Tools: Power BI, Tableau.
Operating Systems: Windows (XP/7/8/10), UNIX, Linux, Ubuntu, CentOS.
Build Automation Tools: Maven, SBT.
Version Control Tools & CI/CD: GIT, GitHub, Jenkins.
IDE & Build Tools: Eclipse, Visual Studio.
Databases: MS SQL Server 2016/2014/2012, Azure SQL DB, Azure Synapse, MS Excel, MS
Access, Oracle 11g/12c, Cosmos DB.
File Formats: JSON, CSV, ORC, Parquet, Avro.
Miscellaneous: PowerShell, Kubernetes, Docker.
EDUCATION:
Bachelor's in Computer Science, Jawaharlal Nehru Technological University, India (April 2011)
Master's in Computer Science and Information Systems, University of Memphis, USA (December 2013)
CERTIFICATIONS:
AZ-900 Microsoft Azure Fundamentals
DP-203 Microsoft Azure Data Engineer Associate
PROFESSIONAL EXPERIENCE:
Wells Fargo, Plano, TX. Sr. Azure Data Engineer
October 2022 to Present
Responsibilities:
Developed end-to-end ETL processes in Azure Data Factory to migrate data from diverse sources into Azure Synapse Analytics;
increased data pipeline throughput by 2x.
Orchestrated data pipelines in Azure Data Factory to ingest source data stored in Azure Data Lake Storage (ADLS GEN 2) using
Delta Lake formats to enable analytics.
Developed ETL processes in Azure Data Factory to ingest data from external sources such as Blob Storage, ORC/Parquet/Text
Files into Azure Synapse Analytics.
Proficient in leveraging Informatica Intelligent Cloud Services (IICS) for data integration, data quality, and data governance tasks.
Skilled in designing, developing, and implementing end-to-end data integration workflows using IICS components such as Cloud
Data Integration, Cloud Application Integration, and Cloud Data Quality.
Worked on configuring and managing cloud data warehouses and data lakes using IICS capabilities.
Implemented advanced data governance and security measures within IICS to ensure compliance with regulatory requirements and
safeguard sensitive data throughout its lifecycle.
Parsed complex nested data in Azure Databricks into normalized Azure Synapse Analytics relational format.
Built business logic using Azure Synapse Analytics Spark pools to flag records based on configurable rules.
Standardized various timestamp formats and parsed nested data using Azure Data Factory and Databricks.
Implemented Delta Lake architecture on Azure Databricks for efficient data lake storage.
Leveraged Delta tables to ensure ACID compliance for data operations, enabling reliable data pipelines.
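A minimal sketch of the kind of ACID-compliant Delta write described here, assuming an Azure Databricks cluster where the Delta runtime is available; mount paths and column names are illustrative placeholders, not the project's actual objects:

```python
# Hedged sketch: paths and columns are placeholders; assumes a Databricks cluster with Delta available.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("delta-acid-demo").getOrCreate()

# Each Delta write is an atomic, versioned transaction, which is what gives the pipeline its ACID guarantees.
transactions = spark.read.parquet("/mnt/raw/transactions")
(transactions
    .withColumn("ingest_date", F.current_date())
    .write.format("delta")
    .mode("append")
    .partitionBy("ingest_date")
    .save("/mnt/curated/transactions"))

# Readers always see a consistent snapshot; time travel can pin an earlier table version if needed.
latest = spark.read.format("delta").load("/mnt/curated/transactions")
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/mnt/curated/transactions")
```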
Implemented a tiered data lake architecture with Azure Data Lake Storage Gen2 (ADLS GEN 2) using Delta Lake for batch and
streaming.
Designed the Azure Synapse Analytics data model focused on key attributes to drive reporting and analysis.
Developed an incremental ETL process in Azure Data Factory for scheduled data loads into Azure Synapse Analytics.
Established Azure Data Factory pipelines with lookup activities to optimize incremental data integration into a relational
database (Azure SQL Database).
Developed scalable data processing pipelines using Python and Azure Data Factory, optimizing data ingestion and transformation
workflows for improved efficiency and reliability.
Played a key role in an analytics project leveraging Azure analytics and big data services aimed at informing business decisions.
Knowledgeable in implementing and maintaining data center security measures to protect against unauthorized access, data
breaches, and cyber threats. Familiar with industry regulations and compliance standards such as HIPAA, ensuring data center
operations adhere to regulatory requirements and industry best practices.
Managed the creation and maintenance of a comprehensive data catalog within Azure Purview, facilitating easy discovery and
understanding of banking data assets across the organization.
Utilized Azure Purview's advanced scanning capabilities to automatically identify and classify sensitive healthcare data, such as
patient records and medical imaging files, enabling better data protection and access controls.
Ensured compliance with healthcare regulations, such as HIPAA, GDPR, and HITECH, by leveraging Azure Purview's data
discovery and classification capabilities to identify compliance risks and implement appropriate remediation measures.
Executed various data transformation tasks in Azure Databricks to structure data for downstream use in Azure Analysis Services.
Built data processing pipelines using Python, leveraging libraries such as Pandas and NumPy for data manipulation and analysis,
contributing to enhanced efficiency and accuracy.
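A small illustrative example of this style of Pandas/NumPy processing; the file name and columns are hypothetical:

```python
# Hedged sketch: the CSV file and column names are hypothetical.
import numpy as np
import pandas as pd

df = pd.read_csv("daily_readings.csv")

# Standardize timestamps and coerce bad numeric values to NaN before aggregating.
df["reading_ts"] = pd.to_datetime(df["reading_ts"], errors="coerce")
df["value"] = pd.to_numeric(df["value"], errors="coerce")

# Cap extreme outliers at the 99th percentile with vectorized NumPy operations.
p99 = np.nanpercentile(df["value"], 99)
df["value"] = np.clip(df["value"], 0, p99)

daily = (df.dropna(subset=["reading_ts"])
           .groupby(df["reading_ts"].dt.date)["value"]
           .agg(["count", "mean", "max"]))
print(daily.head())
```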
Engineered distributed data processing solutions with PySpark, optimizing performance through RDDs and DataFrames, resulting
in a significant reduction in processing time and resource utilization.
Utilized PySpark within Azure Databricks for advanced data transformations, demonstrating expertise in efficiently handling large-
scale data processing tasks.
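A representative PySpark transformation of the sort run in Databricks, here deduplicating to the latest record per key with a PARTITION BY / ORDER BY window; the Delta path and column names are assumptions for illustration:

```python
# Hedged sketch: the Delta path and column names are assumptions.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("latest-per-key").getOrCreate()

events = spark.read.format("delta").load("/mnt/curated/customer_events")

# Keep only the most recent event per customer using a window function.
w = Window.partitionBy("customer_id").orderBy(F.col("event_ts").desc())
latest = (events
          .withColumn("rn", F.row_number().over(w))
          .filter(F.col("rn") == 1)
          .drop("rn"))

latest.write.format("delta").mode("overwrite").save("/mnt/presentation/customer_latest")
```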
Created a secure network on Azure using NSGs, load balancers, autoscaling, and Availability Zones to ensure 99.95% uptime
during peak data loads.
Set up near real-time monitoring alerts and automated issue remediation workflows with Azure Monitor and Logic Apps; cut incident
response time by 50%.
Developed sample Informatica cloud data integration flows and mapping tasks as proof-of-concept while evaluating capabilities
vs traditional Informatica PowerCenter jobs.
Designed centralized access governance using Azure RBAC and AD groups enabling cross-subscription data sharing while
ensuring security compliance.
Implemented comprehensive security measures within environments, including role-based access controls, encryption, and
auditing, to safeguard sensitive data assets.
Enabled encryption-in-transit and at-rest using Azure Key Vault and other security features to adhere to strict regulatory data
handling policies.
Proficient in Azure networking, including virtual networks, subnets, and Azure Firewall.
Adept at extracting data from various sources, transforming it to meet business requirements, and loading it into the Snowflake data
warehouse efficiently and securely.
Extensive experience in Snowflake data warehousing, including data modeling, schema design, and ETL processes.
Designed and implemented real-time data pipelines using Snowpipe on Snowflake, facilitating automatic ingestion of streaming
data sources.
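Snowpipe objects are defined in SQL; the sketch below issues that DDL through the snowflake-connector-python package, with account details, stage, table, and pipe names as placeholders rather than the project's actual objects:

```python
# Hedged sketch: account, credentials, stage, table, and pipe names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="myaccount", user="etl_user", password="***",
    warehouse="LOAD_WH", database="ANALYTICS", schema="RAW",
)

create_pipe = """
CREATE PIPE IF NOT EXISTS RAW.SENSOR_PIPE
  AUTO_INGEST = TRUE
  AS
  COPY INTO RAW.SENSOR_READINGS
  FROM @RAW.SENSOR_STAGE
  FILE_FORMAT = (TYPE = 'JSON')
"""

cur = conn.cursor()
try:
    cur.execute(create_pipe)
    cur.execute("SHOW PIPES IN SCHEMA RAW")  # confirm the pipe is registered
    for row in cur.fetchall():
        print(row)
finally:
    cur.close()
    conn.close()
```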
Collaborated closely with business analysts to convert business requirements into technical requirements and prepared low and
high-level documentation.
Proficient in configuring and optimizing Snowflake databases for performance and scalability, leveraging features such as
clustering, partitioning, and materialized views.
Demonstrated proficiency in designing, implementing, and managing Snowflake data warehousing solutions within the Azure
cloud environment, leveraging best practices for data modeling, optimization, and performance tuning.
Implemented zero-copy cloning in Snowflake, optimizing data replication processes for enhanced performance and efficiency.
Expertise in processing JSON, Avro, Parquet, ORC, and CSV formats for efficient data ingestion, transformation, and storage.
Automated routine data tasks using various software tools, streamlining processes and improving efficiency.
Created data visualizations, dashboards, executive reports, and models, providing valuable insights to key stakeholders.
Utilized Azure Cloud services for data engineering tasks, ensuring scalability, security, and reliability of data solutions.
Troubleshot and resolved real-time data issues, ensuring continuous operation of critical systems.
Collaborated with stakeholders to build and upgrade reports leveraging analytics tools, facilitating data-driven decision-making
processes.
Implemented Agile methodology to streamline development processes and enhance collaboration ensuring efficient delivery of
data solutions.
Environment: Azure Data Factory, ETL/ELT, IICS, Azure Databricks, Azure Synapse Analytics, Azure Data Lake Storage (ADLS GEN
2), Delta Lake, Relational database - Azure SQL Database, Azure Purview, Azure Analysis Services, Azure Monitor, Logic Apps, Key
Vault, Azure networking, Snowflake, Agile methodology.
Terminix, Memphis. Azure Data Engineer
April 2020 to September 2022
Responsibilities:
Migrated an on-prem SQL Server environment containing mission-critical datasets representing decades of operational history to a
reliable Azure cloud platform, leveraging self-hosted integration runtime (SHIR) connectivity.
Extracted terabyte-scale IoT and transactional data from on-prem sources, establishing incremental replication into Azure Data
Factory.
Performed ETL tasks against replicated datasets to ensure quality, such as deduplication checks, null handling, and validation rules.
Worked with Informatica PowerExchange to integrate with IICS and read data from condensed files for loading into Azure SQL
Data Warehouse.
Leveraged IICS's advanced features such as event-driven triggers and dynamic mappings to automate data integration processes,
enhancing operational efficiency and reducing manual intervention.
Implemented a multi-region partitioning strategy for scaling and optimizing workloads during loading into Azure Synapse.
Implemented data lake solutions using Azure Data Lake Storage Gen2 for efficient storage, processing, and analytics.
Architected a modern cloud data platform leveraging Azure storage, compute, and analytics services, delivering TCO savings of
about 10%, while fully preserving existing on-premises capabilities.
Established a foundation for real-time analytics with Event Hubs to process millions of events per second into a data platform with
near-zero latency or data loss.
Wrote PySpark scripts to perform complex transformations, ensuring the seamless integration of diverse data sources.
Developed a robust ARM template to automate the deployment of Azure resources, streamlining the provisioning process and
reducing deployment time by 50%.
Designed and deployed Spark applications in Python, leveraging PySpark's resilient distributed datasets (RDDs) and
DataFrame APIs to handle large-scale data processing tasks, ensuring high throughput and fault tolerance.
Established Stream Analytics jobs with anomaly detection rules to identify and respond to issues preemptively based on changing
conditions across connected systems.
Implemented PySpark-based solutions to integrate Azure services like Azure Data Lake Storage and Azure SQL Database
(relational database), facilitating seamless data movement and ensuring data integrity across various data sources.
Proficient in overseeing the operation and maintenance of data center infrastructure including servers, storage systems,
networking equipment, and power distribution units (PDUs). Experienced in implementing and adhering to industry best
practices for maximizing uptime and minimizing downtime.
Implemented PolyBase in Azure environments for seamless integration and querying across diverse data sources, optimizing data
engineering workflows.
Built visually rich Power BI dashboards compiled from Stream Analytics outputs, providing a single pane of glass view into
nationwide energy systems and assets.
Leveraged Azure Cosmos DB and Azure Blob Storage to efficiently store and manage large volumes of IoT data with high
availability and scalability.
Loaded batch data into Delta Lake, enabling reliability, performance, and interoperability across the Azure analytics engines.
Developed end-to-end proofs-of-concept for an IoT use case ingesting real-time sensor data into Azure and leveraging Stream
Analytics for detection alerts, gaining hands-on experience with the core concepts.
Implemented a serverless architecture for stream processing by building an Azure Event Hub pipeline publishing telemetry events
to Kafka clusters and Spark Streaming jobs.
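A hedged sketch of such a pipeline, reading Event Hubs telemetry through its Kafka-compatible endpoint with Spark Structured Streaming; the namespace, topic, schema, and storage paths are assumptions:

```python
# Hedged sketch: Event Hubs namespace, topic, schema, and paths below are placeholders.
from pyspark.sql import SparkSession, functions as F, types as T

spark = SparkSession.builder.appName("telemetry-stream").getOrCreate()

# Event Hubs exposes a Kafka-compatible endpoint on port 9093; the namespace connection string
# is used as the SASL PLAIN password.
eh_conn = "Endpoint=sb://my-namespace.servicebus.windows.net/;SharedAccessKeyName=listen;SharedAccessKey=***"

raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "my-namespace.servicebus.windows.net:9093")
       .option("subscribe", "telemetry")
       .option("kafka.security.protocol", "SASL_SSL")
       .option("kafka.sasl.mechanism", "PLAIN")
       .option("kafka.sasl.jaas.config",
               'org.apache.kafka.common.security.plain.PlainLoginModule required '
               f'username="$ConnectionString" password="{eh_conn}";')
       .load())

schema = T.StructType([
    T.StructField("device_id", T.StringType()),
    T.StructField("temperature", T.DoubleType()),
    T.StructField("event_ts", T.TimestampType()),
])

events = raw.select(F.from_json(F.col("value").cast("string"), schema).alias("e")).select("e.*")

# Land the parsed stream in a bronze Delta path with checkpointing for recovery.
(events.writeStream
       .format("delta")
       .option("checkpointLocation", "/mnt/checkpoints/telemetry")
       .start("/mnt/bronze/telemetry"))
```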
Designed Azure Databricks architecture leveraging containers and CI/CD to rapidly develop and deploy machine learning models
for predicting equipment failures from IoT data.
Managed continuous data ingestion and processing workflows with Delta's built-in support for change data capture (CDC).
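One common way to apply CDC feeds to Delta tables is a MERGE, sketched below with the delta-spark Python API; the paths, join key, and the `op` change-flag convention are illustrative assumptions:

```python
# Hedged sketch: paths, the join key, and the `op` change flag are illustrative conventions.
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("cdc-merge").getOrCreate()

changes = spark.read.format("delta").load("/mnt/landing/customer_changes")  # CDC feed
target = DeltaTable.forPath(spark, "/mnt/curated/customers")

(target.alias("t")
    .merge(changes.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedDelete(condition="s.op = 'D'")
    .whenMatchedUpdateAll(condition="s.op IN ('U', 'I')")
    .whenNotMatchedInsertAll(condition="s.op <> 'D'")
    .execute())
```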
Led initiatives focused on migrating existing big data workloads like ETL and analytics from on-prem Hadoop clusters to Azure
HDInsight managed platform.
Created a Lambda architecture on Azure, allowing batch and real-time data to be queried in one analytics interface using services
like Synapse Spark pools.
Implemented metadata tagging standards throughout Azure Data Factory, allowing end-to-end lineage tracking of transformed
data to trusted sources.
Successfully led the design and execution of migration strategies to transition Teradata databases to Azure cloud infrastructure,
ensuring seamless continuity of operations and minimizing downtime.
Proven track record of developing end-to-end ETL processes and data integration solutions using Snowflake and Azure
technologies.
Developed complex Snowflake SQL queries and stored procedures to automate data transformations, enabling real-time analytics
and providing actionable insights to stakeholders for informed decision-making.
Spearheaded the successful implementation of Snowflake as the primary data warehousing solution within the Azure environment.
Developed and optimized complex ETL pipelines in Snowflake, enhancing data processing efficiency by 40% through schema
optimization and query performance tuning.
Worked with ETL tools and technologies such as Informatica PowerCenter, SSIS, Talend, and DataStage.
Worked on concepts such as partitioning, bucketing, join optimizations, SerDes, built-in UDFs, and custom UDFs.
Strong experience using the Spark RDD API, Spark DataFrame/Dataset API, Spark SQL, and Spark ML frameworks for building end-
to-end data pipelines.
Implemented DevOps culture utilizing GitHub workflows alongside Azure boards and backlogs to increase data and site reliability
engineering collaboration.
Identified data relationships and developed solutions in various data blending tools, enhancing data analysis capabilities.
Led the data wrangling and ingestion of data sources from multiple parties into analytics platforms.
Devised business rules for data acceptance, translation, and mapping, ensuring data integrity and reliability.
Transformed data into reliable and easily updatable data extracts for use in reporting and modeling, optimizing efficiency and
accuracy.
Implemented Agile methodology to streamline project delivery, fostering collaboration between cross-functional teams and ensuring
efficient development cycles.
Environment: Azure Data Factory, IICS, Azure Data Lake Storage, Event Hubs, Stream Analytics, Power BI, Azure Synapse Analytics,
Azure Databricks, PySpark, Delta Lake, Azure SQL Database (relational database), Teradata, Azure VMs, Azure Site Recovery, Event
Grid, Key Vault, Container Registry, Blockchain Workbench, Power Platform, GitHub, Azure DevOps, Snowflake, Agile methodology.
GEICO, Dallas, TX. Data Engineer
March 2018 to March 2020
Responsibilities:
Successfully designed and implemented an enterprise-level Data Lake enabling advanced analytics and processing of large-scale,
high-velocity datasets.
Proficiently developed and deployed Azure Analysis Services tabular models to meet business intelligence and reporting
requirements.
Utilized Control-M's monitoring and alerting functions to promptly detect and resolve job failures or performance concerns.
Experienced in crafting complex Transact-SQL (T-SQL) queries, stored procedures, functions, and triggers for efficient data
retrieval, manipulation, and database management.
Designed and implemented sophisticated data pipelines and transformations using Azure Data Factory (ADF) and PySpark in
Databricks to fulfill intricate data flow needs.
Executed resilient Python, Spark, and Bash scripts to streamline data transformation and loading across diverse hybrid
environments, showcasing dedication to efficient and effective data processing and management.
Utilized Apache Spark to enhance intraday and real-time data processing through SQL and Streaming modules, maximizing
capabilities.
Implemented Spark SQL optimizations in Scala and Python, enhancing data processing efficiency through seamless RDD-to-
DataFrame conversions, resulting in accelerated analysis and improved system performance.
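A minimal PySpark illustration of the RDD-to-DataFrame conversion pattern, with made-up records and field names; real jobs would read the raw data from HDFS or blob storage:

```python
# Hedged sketch with made-up records and field names.
from pyspark.sql import SparkSession, Row

spark = SparkSession.builder.appName("rdd-to-df").getOrCreate()
sc = spark.sparkContext

lines = sc.parallelize([
    "1001,2019-06-01,349.99",
    "1002,2019-06-01,120.00",
])

def parse(line):
    claim_id, claim_date, amount = line.split(",")
    return Row(claim_id=int(claim_id), claim_date=claim_date, amount=float(amount))

# Converting the RDD to a DataFrame hands the query over to Catalyst/Tungsten optimization.
claims = spark.createDataFrame(lines.map(parse))
claims.createOrReplaceTempView("claims")
spark.sql("SELECT claim_date, SUM(amount) AS total FROM claims GROUP BY claim_date").show()
```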
Utilized Informatica's capabilities to improve data quality, integrity, and governance across the ETL process, reducing errors and
ensuring data reliability.
Implemented Apache Spark pools within Synapse for big data processing and transformation tasks, optimizing for both batch and
streaming data.
Developed a scalable ETL framework using Spark Data Sources and Hive objects to facilitate smooth migrations from RDBMS
systems to Data Lakes, showcasing proficiency in data architecture and optimization.
Proficient in Python for crafting and deploying efficient data pipelines essential for big data ecosystems.
Proficient in SQL for database admin tasks including user management, backup, recovery, and data security maintenance.
Utilized NoSQL databases such as MongoDB and Cassandra to optimize scalability and performance for high-velocity transactions
and unstructured data, enabling flexible data models for dynamic business applications.
Implemented comprehensive database imports and exports using SSIS and DTS, enhancing data integration efficiency and reliability
enterprise-wide.
Utilized JIRA reporting tools to enhance project transparency and accountability through insightful metrics and progress updates.
Implemented Agile metrics and reporting mechanisms to monitor team velocity, sprint progress, and key performance indicators,
facilitating data-informed decision-making and project forecasting.
Environment: Azure Analysis Services, Azure Data Factory, Azure Databricks, PySpark, Python, Apache Spark, MongoDB, HBase,
MySQL, ETL, Hadoop, HDFS, Hive, Sqoop, CI/CD, SSRS, SSIS, Tableau, Scala.
Kroger, Cincinnati, OH. Big Data Engineer
July 2016 to February 2018
Responsibilities:
Designed and developed applications on the data lake to transform data according to business users' requirements for analytics.
Responsible for managing data coming from different sources and involved in HDFS maintenance and loading of structured and
unstructured data.
Proficient in designing, scheduling, and orchestrating complex data workflows using Apache Oozie, ensuring efficient data
processing and coordination.
Experience with Apache Kafka, including setting up, configuring, and managing Kafka clusters to facilitate real-time data processing
and messaging within a distributed system architecture.
Worked with different file formats such as CSV, TXT, and fixed-width files to load data from various sources into raw tables.
Conducted data model reviews with team members and captured technical metadata through modeling tools.
Implemented ETL processes, writing and optimizing SQL queries to perform data extraction and merging from the SQL Server database.
Extracted data from RDBMS using Sqoop and stored it in HDFS for further processing.
Applied data visualization techniques and designed interactive dashboards using Power BI to present complex reports, charts,
summaries, and graphs to team members and stakeholders.
Proficient in Netezza, demonstrating expertise in data warehousing, database management, and SQL querying within the Netezza
environment.
Administered and maintained the Hadoop Distributed File System using the Hadoop-Java API.
Analyzed data using Spark and Hive, generating insightful summary results for downstream systems.
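An illustrative Spark-on-Hive summary query; the database, table, and column names are placeholders:

```python
# Hedged sketch: the Hive database, table, and columns are placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-summary")
         .enableHiveSupport()   # read tables registered in the Hive metastore
         .getOrCreate())

summary = spark.sql("""
    SELECT store_id,
           COUNT(*)         AS txn_count,
           SUM(sale_amount) AS total_sales
    FROM retail.pos_transactions
    WHERE sale_date >= date_sub(current_date(), 7)
    GROUP BY store_id
""")

# Persist the summary for downstream reporting systems.
summary.write.mode("overwrite").saveAsTable("analytics.weekly_store_sales")
```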
Experience in loading logs from multiple sources into HDFS using Flume.
Worked with NoSQL databases like HBase in creating HBase tables to store large sets of semi-structured data coming from
various data sources.
Involved in designing and developing tables in HBase and storing aggregated data from Hive tables.
Developed complex MapReduce jobs for performing efficient data transformations.
Performed data cleaning, pre-processing, and modeling using Java MapReduce.
Experience in integrating diverse data sources through Web Services and orchestrating data pipelines using Shell scripting in Big
Data ecosystems.
Expertly managed the end-to-end lifecycle of Teradata databases, encompassing design, development, configuration, testing, and
troubleshooting, while seamlessly collaborating with Teradata support staff and technical teams, and delivering comprehensive
training and support to end users in the dynamic realm of big data.
Strong experience in writing SQL queries.
Responsible for triggering jobs using Control-M.
Environment: Hadoop, Hadoop Distributed File System, MapReduce, Oozie, Kafka, Hive, Spark, Sqoop, Flume, HBase, Power BI,
Netezza, Web Services, Shell Script, SQL, Teradata, Control-M.
Workspace, San Francisco, CA. Data Warehouse Engineer
January 2014 to June 2016
Responsibilities:
Extensively used Informatica client tools: Source Analyzer, Warehouse Designer, Mapping Designer, and Mapplet Designer.
Extracted data from different sources of databases. Created a staging area to cleanse and validate the data.
Designed and developed complex Aggregator, Expression, Filter, Joiner, Router, Lookup, and Update Strategy transformation rules.
Developed schedules to automate the update processes and Informatica sessions and batches.
Analyzed, designed, constructed, and implemented ETL jobs using Informatica.
Developed mappings and transformations using the Mapping Designer, Transformation Developer, and Mapplet Designer
in Informatica PowerCenter 8.x.
Leveraged Apache NiFi to orchestrate complex ETL workflows, facilitating seamless data flow from various sources into the data
warehouse.
Used Informatica PowerCenter and PowerExchange for extracting, transforming, and loading data from relational and
non-relational sources.
Designed and implemented data models using Star Schema and Snowflake Schema, ensuring data is organized in the most efficient
and meaningful manner.
Proficient in utilizing Talend for data integration, ETL processes, and data quality management.
Experienced in connecting various data sources and destinations, including databases, APIs, and cloud services, to Talend jobs.
Skilled in optimizing performance and ensuring data integrity within Oracle Data Warehouse environments.
Leveraged PowerCenter to transform and load different sources of data to the MDM database.
Proficient in designing, developing, and implementing data integration solutions using Informatica PowerCenter 8.x.
Proficient in designing and optimizing data warehouse solutions using MS SQL Server 2000, including schema design, ETL
processes, and performance tuning techniques to ensure efficient data storage and retrieval for analytical purposes.
Implemented automated build and deployment processes using Maven and SBT tools, enhancing efficiency and ensuring
consistency in the development and deployment of data warehousing solutions.
Designed basic UNIX scripts and automated them to run the workflows daily, weekly, and monthly.
Scheduled sessions and batches on the Informatica server using Informatica Workflow Manager.
Migrated Informatica objects and database objects to the integration environment and scheduled them using AutoSys.
Implemented complex ETL processes using PL/SQL for data warehousing, ensuring efficient data extraction, transformation, and
loading, thereby optimizing data retrieval and analysis for business intelligence purposes.
Identified and created various classes and objects for report development.
Collaborated with business analysts to understand requirements and translate them into Tableau visualizations that address
stakeholders' needs and objectives.
Environment: Star schema and Snowflake schema, Informatica Power Exchange, Oracle, DB2, XML, Flat files, Oracle data
warehouse, Informatica Workflow Manager, Maven, SBT, Talend, Unix shell scripts, slowly changing dimensions, Informatica
PowerCenter 8.x, Oracle 11g, MS SQL Server 2000, SQL, PL/SQL, Tableau.