Data Architect (Chicago, IL, Hybrid) at Chicago, Illinois, USA
Email: [email protected]
From: ashok, hclglobal [email protected]
Reply to: [email protected]

Data Architect
Chicago, IL (Hybrid)

Note: Profiles with 14+ years of experience
Data Engineering, Data Integration, Cloud Infrastructure

Job Description:

The Data Architect for Azure and Databricks will be responsible for designing and implementing scalable, high-performance data engineering solutions on Microsoft Azure and Databricks platforms. This role involves overseeing the end-to-end design of data pipelines, working closely with data scientists, analysts, and other stakeholders, and ensuring data infrastructure is optimized for advanced analytics, machine learning, and business intelligence. The Data Engineering Architect will play a key role in transforming raw data into actionable insights using modern cloud and big data technologies.

Key Responsibilities:

1. Data Engineering Architecture Design:
   a. Design and implement data engineering solutions on Azure and Databricks, ensuring scalability, reliability, and performance.
   b. Architect end-to-end data pipelines and workflows for data ingestion, transformation, and storage, using Azure Data Factory, Databricks, and other Azure services.
   c. Work with data architects to design cloud data models, data lakes, and data warehouses that integrate seamlessly with other business systems.

2. Data Integration & ETL Pipelines:
   a. Design, implement, and manage ETL (Extract, Transform, Load) processes to bring data from various sources into Azure and Databricks environments.
   b. Leverage Azure Data Factory for orchestrating data flows, and use Databricks notebooks for data processing and transformation tasks.

3. Big Data Solutions & Advanced Analytics:
   a. Architect data solutions on Azure Synapse Analytics, Databricks, and other relevant services to support big data processing, analytics, and machine learning workflows.
   b. Develop and optimize large-scale Spark-based data pipelines using Apache Spark on Azure Databricks for processing structured and unstructured data.
   c. Implement solutions for data storage and processing in Azure Data Lake Storage or Azure Blob Storage.

4. Cloud Infrastructure & Automation:
   a. Design and maintain cloud infrastructure to support large-scale data operations in Azure, leveraging services like Azure Kubernetes Service (AKS), Azure Virtual Networks, Azure Key Vault, and more.
   b. Implement automation for continuous integration and continuous deployment (CI/CD) pipelines for data pipelines and notebooks using Azure DevOps and GitHub Actions.
   c. Ensure cloud resources are cost-efficient, optimized for performance, and maintainable.

5. Collaboration & Stakeholder Management:
   a. Work closely with data scientists, analysts, and business stakeholders to understand data needs and design appropriate solutions.
   b. Provide technical leadership and mentoring to the data engineering team, ensuring high-quality code, optimal pipeline performance, and best practices.
   c. Collaborate with business teams to ensure data solutions align with business goals and objectives, driving actionable insights and decision-making.

6. Security, Governance, and Compliance:
   a. Implement data security best practices on Azure and Databricks, ensuring compliance with data privacy regulations such as GDPR, HIPAA, and SOC 2.
   b. Leverage Azure Active Directory (AAD) for role-based access control (RBAC) and ensure appropriate encryption, masking, and access policies are in place.
   c. Develop and maintain a governance model for data access, data quality, and metadata management.

7. Performance Optimization & Troubleshooting:
   a. Continuously monitor and optimize data pipelines for performance, reliability, and cost efficiency, ensuring minimal downtime.
   b. Troubleshoot and resolve issues related to data quality, pipeline failures, and performance bottlenecks.
   c. Implement data lineage tools and practices to trace the flow of data across the system and ensure data integrity.

8. Documentation & Reporting:
   a. Document architecture, pipeline designs, workflows, and best practices.
   b. Provide regular updates to stakeholders on project status, performance metrics, and improvements to the data ecosystem.

Tech Stack Details: Cloud Platform (Azure)

Core Azure Services for Data Engineering:

Azure Data Factory: Orchestrates and automates data workflows, handles ETL/ELT processes, and integrates on-premises and cloud-based data.
Azure Databricks: A unified analytics platform optimized for Apache Spark, providing data processing, machine learning, and data engineering capabilities.
Azure Synapse Analytics: A powerful analytics service that integrates big data and data warehousing, supporting SQL, Spark, and pipelines.
Azure Data Lake Storage Gen2: A scalable and secure data lake solution for storing large amounts of unstructured data.
Azure Blob Storage: A highly scalable object storage solution that can hold both structured and unstructured data.
Azure SQL Database / Azure SQL Data Warehouse (Synapse SQL Pools): Managed relational databases for transactional workloads and data warehousing.
Azure Key Vault: Securely stores secrets, keys, and certificates used in data pipeline security and compliance.

Keywords: continuous integration, continuous deployment, Illinois
12:45 AM 04-Feb-25