
Senior GCP Data Architect | H1B | 20+ Years | San Jose, CA
[email protected]
Location: San Jose, California, USA
Relocation: Yes
Visa: H1B
Resume file: Saravanan M_ GCP Data Architect_1769184281007.docx
Senior GCP Data Architect with over 20 years of experience, including 8+ years of hands-on specialization in Python, SQL, and GCP services. Proven track record in transforming complex data into actionable insights, with expertise in building scalable Dataflow streaming pipelines, orchestrating complex workflows in Cloud Composer (Airflow), and optimizing BigQuery for petabyte-scale datasets. Adept at requirements elicitation and stakeholder collaboration, expertly translating complex functional needs into robust, scalable, and cost-optimized technical data architectures on GCP. Experienced in designing data models for profitability and customer behavior analysis, and in leading development teams to successful project completion.

Certification:
Completed Post Graduate Program in Artificial Intelligence and Machine Learning: Business Applications at UT Austin
Completed Google Generative AI Leader Certification
Google AI Essentials
Divide and Conquer, Sorting and Searching, and Randomized Algorithms
Building Batch Data Pipelines on GCP
Building Resilient Streaming Analytics Systems on GCP
Smart Analytics, Machine Learning, and AI on GCP
Modernizing Data Lakes and Data Warehouses with GCP
Practical Machine Learning
Google Cloud Platform Fundamentals: Core Infrastructure, Big Data and Machine Learning Fundamentals
R Programming
Python Data structures and programming
Microsoft Certified Solutions Associate (MCSA) in Querying Microsoft SQL Server 2012, Administering Microsoft SQL Server 2012 Databases, and Implementing a Data Warehouse with Microsoft SQL Server 2012
Extensive experience with Agile methodology; Certified Scrum Master (CSM)

Highlights
Technical Tools: Google tools (Dremel/Google SQL, Google BigQuery, Piper, Cider, Apps Script, Google Analytics, Data Studio), Snowflake, Databricks, GCP Vertex AI, Looker, Power BI, SSRS, SQL, NLTK, TFS, SVN, SSIS, RStudio, sklearn, Neural Networks, dbt, Terraform, PowerBuilder 9.0
Reporting & Data Architecture: Expert-level BigQuery skills including data modeling, performance optimization, and ingestion. Extensive hands-on experience with GCP services such as Cloud Storage and Vertex AI, and with developing batch/streaming data pipelines.
Customer Lifecycle Analytics & Spending Behavior Analysis: Self-service and enterprise reporting, analytics and measurement, audience segmentation, personalization and targeting, vision/mission/OKRs/roadmaps and execution, privacy compliance and SOX audit; Marketing, Finance, and Healthcare domains.
Programming & Data Science: Expert-level proficiency in Python (including data engineering libraries like Pandas, PySpark, NumPy and Beam) and a solid command of advanced SQL/Google SQL for complex querying, data processing, and performance tuning.
Automation, DevOps & MLOps: CI/CD pipeline development (Google Cloud Build, TFS), automated testing (Pytest), Docker.

Awards & recognitions:
Received outstanding performance award at HCL multiple times (2013, 2018)
Received outstanding performance awards from customers Thomson Reuters and CEB
Education:
Alagappa Chettiar College of Engineering, Madurai Kamaraj University, Madurai, India. Bachelor of Engineering, Electrical and Electronics.

Work Experience:

Client: Google. Nov 2015 - Present
Role: Senior Data/ETL Architect
Data Infrastructure:
Led Business Analysis tasks and Stakeholder Collaboration sessions with Product and Analytics teams to elicit and finalize data requirements, ensuring the MLOps infrastructure directly supported strategic business goals (e.g., buyer segmentation).
Engineered and automated an end-to-end data pipeline on GCP that moves marketing person data from Google's internal platform to GCP. Used Python to read data from Cloud Storage, clean job titles, and load them into BigQuery. Used a Vertex AI AutoML model to predict buyer segmentation based on title, country, and account segment, and performed feature engineering for model serving using loss functions, optimizers, and sigmoid classification.
Built an account validation pipeline that moves files from an FTP location to GCS, applies transformations, and creates a table in BigQuery. Pulled account data by domain and website via web scraping with the Beautiful Soup library, then used Gemini 2.5 Pro to detect specific signals (public domain, active, etc.) in the data and segregate valid and invalid accounts.
Architected and implemented Snowflake-based analytical data platforms supporting large-scale enterprise reporting, research analytics, and content-driven insights.
Performed data cleaning, transformation, and ingestion using Python.
Built the marketing funnel: lead creation, campaign member creation, engagement, MQL, SAL, QSO, LTC, and LTO.
Integrated different machine learning algorithms and enabled measurement.
Built a scoring model for evaluating contact quality.
Designed end-to-end ELT pipelines using Snowflake and Databricks to ingest, transform, and curate high-volume structured and semi-structured datasets used for business intelligence and decision analytics.
Worked with various data sources (CSV, Excel, JSON, Google Analytics, and Salesforce) and built a data warehouse in Databricks (Delta Lake, Databricks SQL) for the GCP Marketing Database Operations and Strategy team.
Implemented dbt (Data Build Tool) pipelines for modular SQL transformations, using macros and Jinja templating to reduce code redundancy by 40%.
Worked with third-party tools such as Boomi to create contacts and leads in Salesforce.
Architected end-to-end data platforms on BigQuery, Cloud Composer, and Dataflow, translating complex functional needs (e.g., real-time personalization) into resilient, scalable technical data architectures.
Integrated data from multiple enterprise sources (relational databases, cloud storage, APIs, and flat files) into Snowflake using scalable ingestion and validation patterns.
Reduced BigQuery slot consumption by 30% and query costs by 20% through advanced partitioning, clustering, and rewriting JOIN logic on nested structures.
Migrated data from Snowflake to BigQuery.
Drove BigQuery performance optimization by refactoring Google SQL scripts, implementing partitioning and clustering strategies, and optimizing data models to reduce query costs and improve dashboard retrieval times for 500+ users.
Enabled data lineage and monitoring capabilities to track data movement, transformations, and SLA compliance across Snowflake pipelines.
Developed custom Apache Beam pipelines in Python to ingest streaming clickstream data via Pub/Sub, handling late-arriving data and windowing logic before writing to BigQuery (a minimal pipeline sketch follows this section).
Managed and orchestrated complex ETL workflows using Apache Airflow (Cloud Composer), creating and scheduling Python-based DAGs for daily data pipeline execution, monitoring, and alerting (see the DAG sketch below).
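
Illustrative sketch of the Beam streaming pattern referenced above (Pub/Sub ingestion, fixed windows with allowed lateness, write to BigQuery). This is a minimal example, not the production pipeline; the project, topic, table, and schema names are placeholders.

# Minimal Apache Beam streaming sketch: Pub/Sub -> windowing -> BigQuery.
# Project, topic, table, and schema names are placeholders, not production values.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions
from apache_beam.transforms import window
from apache_beam.transforms.trigger import AfterWatermark, AfterCount, AccumulationMode

def run():
    options = PipelineOptions()
    options.view_as(StandardOptions).streaming = True
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadClickstream" >> beam.io.ReadFromPubSub(
                topic="projects/example-project/topics/clickstream")
            | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "Window" >> beam.WindowInto(
                window.FixedWindows(60),                     # 1-minute fixed windows
                trigger=AfterWatermark(late=AfterCount(1)),  # re-fire for late-arriving events
                allowed_lateness=600,                        # tolerate up to 10 minutes of lateness
                accumulation_mode=AccumulationMode.DISCARDING)
            | "WriteToBQ" >> beam.io.WriteToBigQuery(
                "example-project:analytics.clickstream_events",
                schema="event_id:STRING,user_id:STRING,event_ts:TIMESTAMP,page:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
        )

if __name__ == "__main__":
    run()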
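
Likewise, a minimal Cloud Composer (Airflow 2.x) DAG sketch for the daily orchestration described above; the DAG id, schedule, and task callables are illustrative placeholders rather than the actual workflow.

# Minimal Airflow 2.x DAG sketch: daily load -> transform -> validate sequence.
# DAG id, schedule, and callables are placeholders for the real pipeline hooks.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def load_from_gcs(**context):
    """Placeholder: pull the day's files from Cloud Storage."""

def transform_and_load_bq(**context):
    """Placeholder: clean the data and load it into BigQuery."""

def validate_counts(**context):
    """Placeholder: compare source/target row counts and alert on mismatch."""

default_args = {
    "owner": "data-eng",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
    "email_on_failure": True,
}

with DAG(
    dag_id="daily_marketing_pipeline",
    default_args=default_args,
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    tags=["marketing", "bigquery"],
) as dag:
    load = PythonOperator(task_id="load_from_gcs", python_callable=load_from_gcs)
    transform = PythonOperator(task_id="transform_and_load_bq", python_callable=transform_and_load_bq)
    validate = PythonOperator(task_id="validate_counts", python_callable=validate_counts)

    load >> transform >> validate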

Reporting Architecture & Dashboard Development
Collaborated extensively with senior management and marketing stakeholders to define technical requirements for new reporting solutions, translating ambiguous business needs into detailed technical design specifications.
Worked on Plx and Databricks (Delta Lake, Databricks SQL) ETL pipelines to build dimension and fact tables, and designed the foundation layer in the Plx, Data Studio, and Looker reporting tools, covering 600 attributes, metrics, and filters.
Architected and delivered enterprise-level reporting solutions, including dashboards for customer lifetime value (LTV) and fractional attribution, serving over 500 users in the marketing team.
Led the end-to-end design of reporting solutions for program profitability and performance, architecting data flows and transformation logic in BigQuery and Snowflake to power executive-level dashboards in Looker.
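
An illustrative example of the partitioning/clustering pattern applied to the reporting tables behind these dashboards, issued through the google-cloud-bigquery Python client. The dataset, table, and column names are assumptions for the sketch, not the real schema.

# Rebuild a reporting table partitioned by date and clustered on the columns
# dashboards filter on most, so queries prune partitions and scan less data.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

ddl = """
CREATE OR REPLACE TABLE marketing_reporting.program_performance
PARTITION BY DATE(event_ts)
CLUSTER BY program_id, region
AS
SELECT event_ts, program_id, region, spend, pipeline_amount
FROM marketing_staging.program_performance_raw
"""

client.query(ddl).result()  # run the DDL; dashboard queries then filter on DATE(event_ts)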

Data Quality:
Audit the marketing database person data every month, archive the historical data, and maintain the database with qualified data.
Conducted A/B testing to compare site performance. The experiment involved adding unique experiment IDs to the UTM signals to measure the impact of different variations.
Implemented data access policies for end users according to their roles and responsibilities.
Implemented data masking on PII data.
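
One possible masking approach, sketched in Python: deterministic salted hashing of identifier columns before exposure to downstream users. Column names and salt handling are placeholders; production masking may instead rely on BigQuery column-level security and policy tags.

# Salted SHA-256 masking of PII columns in a pandas DataFrame. Hashing is
# deterministic, so masked values still join across tables without revealing
# the raw identifiers. Column list and salt source are placeholders.
import hashlib

import pandas as pd

PII_COLUMNS = ["email", "phone", "full_name"]

def mask_value(value: str, salt: str) -> str:
    """Hash a single PII value with a salt so the raw value is not recoverable."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()

def mask_pii(df: pd.DataFrame, salt: str) -> pd.DataFrame:
    """Return a copy of df with all configured PII columns masked."""
    masked = df.copy()
    for col in PII_COLUMNS:
        if col in masked.columns:
            masked[col] = masked[col].fillna("").astype(str).map(lambda v: mask_value(v, salt))
    return masked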

Automated Testing & CI/CD:
Provided technical leadership during development and SQA testing phases by designing automated testing and validation frameworks for data pipelines and ML models.
Established an automated testing framework using Pytest and Great Expectations to enforce data quality contracts and schema validation within ETL processes, preventing downstream data corruption.
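
A minimal sketch of the kind of Pytest data-quality checks enforced in that framework (schema, null, and uniqueness contracts on a pipeline output). The data loader and column names are hypothetical; Great Expectations provides equivalent declarative expectations.

# Pytest data-quality contracts on a pipeline's staging output.
# The parquet path and column names are placeholders for the real pipeline hooks.
import pandas as pd
import pytest

EXPECTED_COLUMNS = {"account_id", "country", "segment", "created_at"}

@pytest.fixture
def accounts_df() -> pd.DataFrame:
    # Placeholder: in the real suite this reads the pipeline's staging output.
    return pd.read_parquet("staging/accounts.parquet")

def test_schema_contract(accounts_df):
    # All contracted columns must be present.
    assert EXPECTED_COLUMNS.issubset(accounts_df.columns)

def test_primary_key_not_null_and_unique(accounts_df):
    # The primary key must be fully populated and unique.
    assert accounts_df["account_id"].notna().all()
    assert accounts_df["account_id"].is_unique

def test_segment_values_are_known(accounts_df):
    # Segment values must come from the agreed vocabulary.
    allowed = {"SMB", "Mid-Market", "Enterprise"}
    assert set(accounts_df["segment"].dropna().unique()) <= allowed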

Master Catalog and SQL analysis:
Created consolidated product, pillar, and user masters to enable unified data usage.
Consolidated data from different applications and exposed the dimensional model to the Analyst and Data Science teams for querying and data exploration.

Corporate Executive Board (CEB). Jan 2014 - Aug 2015
Sr Technical Manager
Designed data models for Data Migration, Master Data Management (MDM), and Metadata Management to align with enterprise data platform requirements.
Implemented MDM and Data Quality Services (DQS) to ensure data integrity and governance.
Developed SSIS packages to automate daily data ingestion processes and crafted monitoring scripts to optimize pipeline performance.
Led architecture design reviews, established Snowflake and Databricks best practices, and mentored engineers on data modeling, SQL optimization, and platform standards.
Constructed various ETL packages using containers, tasks, and transformations to support scalable data flows.
Configured SQL Server packages including logging, debugging, and package configurations to enhance reusability and performance.
Supported Power BI based analytics and executive dashboards by designing analytics-ready Snowflake datasets and optimizing queries for reporting performance.
Utilized Conditional Split, Derived Column, Lookup, Merge Join, Sort, Union All, Multi Cast, OLE DB command, Aggregate, and Table Difference transformations to optimize data pipelines.
Scheduled and monitored ETL packages using SQL Server Job Agent to maintain daily load consistency.
Validated and debugged Data flows and Control flows to ensure data quality and efficient processing.
Implemented checkpoint configurations and package reusability practices in TFS to support agile development methodologies.

Thomson Reuters Inc. Sep 2004 - Dec 2013
Tech Lead
Managed offshore teams while maintaining continuous Stakeholder Collaboration with business owners and end-users to align data warehousing and reporting systems with evolving business intelligence needs.
Spearheaded Requirements Elicitation and Business Analysis activities to define project scope and acceptance criteria. Subsequently translated functional needs into robust Conceptual, Logical, and Physical Data Models (CDM, LDM, PDM) to establish the enterprise data architecture.
Engineered scalable data integration solutions by extending CDM to LDM and PDM and addressing constraints and indexing as per target SQL Server requirements.
Crafted SSIS packages to automate daily data ingestion and monitoring processes, aligning with performance tuning and data pipeline development best practices.
Developed stored procedures and automated metadata creation to facilitate rapid content release cycles and maintain data integrity.
Optimized stored procedures for performance enhancements in line with large scale data warehousing principles.
Generated SQL deployment scripts by comparing TFS and database server configurations, ensuring smooth release management.
