Big Data Operations Lead, Cloud (GCP, AWS or Azure), PySpark, BigQuery & Airflow (San Jose, CA) at Remote, USA |
Email: [email protected] |
From: Sunita Rani, Scalable Systems [email protected]
Reply to: [email protected]

Hi,

This is a requirement from TCS. Please share your candidate's resume for the requirement below at [email protected]

Job Title: Big Data Operations Lead, Cloud (GCP/AWS/Azure), PySpark, BigQuery & Airflow
Location: San Jose, CA
TCS contract positions

Skills, Roles and Responsibilities (Google/AWS/Azure public cloud, PySpark, BigQuery and Google Airflow)

Participate in 24x7x365 SAP environment rotational shift support and operations.
As a team lead, you will be responsible for day-to-day maintenance of the upstream Big Data environment, through which millions of financial data records flow. The environment consists of PySpark, BigQuery, Dataproc and Google Airflow.
You will be responsible for streamlining and tuning existing Big Data systems and pipelines and building new ones. Making sure the systems run efficiently and at minimal cost is a top priority.
Manage the operations team in your respective shift.
You will be making changes to the underlying systems. This role involves providing day-to-day support, enhancing platform functionality through DevOps practices, and collaborating with application development teams to optimize database operations.
Architect and optimize data warehouse solutions using BigQuery to ensure efficient data storage and retrieval.
Install, build, patch, upgrade and configure big data applications.
Manage and configure BigQuery environments, datasets, and tables.
Ensure data integrity, accessibility, and security in the BigQuery platform.
Implement and manage partitioning and clustering for efficient data querying.
Define and enforce access policies for BigQuery datasets.
Implement query usage caps and alerts to avoid unexpected expenses.
Should be very comfortable troubleshooting issues and failures on Linux-based systems, with a good grasp of the Linux command line.
Create and maintain dashboards and reports to track key metrics such as cost and performance.
Integrate BigQuery with other Google Cloud Platform (GCP) services such as Dataflow, Pub/Sub, and Cloud Storage.
Enable BigQuery access through tools such as Jupyter Notebook, Visual Studio Code, and other CLIs.
Implement data quality checks and data validation processes to ensure data integrity.
Manage and monitor data pipelines using Airflow and CI/CD tools (e.g., Jenkins, Screwdriver) for automation.
Collaborate with data analysts and data scientists to understand data requirements and translate them into technical solutions.
Provide consultation and support to application development teams for database design, implementation, and monitoring.
Proficiency in Unix/Linux OS fundamentals, Perl/Python scripting, and Ansible for automation.
Disaster Recovery & High Availability: expertise in planning and coordinating disaster recovery, including backup/restore operations.
Experience with geo-redundant databases and Red Hat Cluster.
Accountable for ensuring that delivery is within the defined SLA and agreed milestones (projects) by following best practices and processes for continuous service improvement.
Work closely with other support organizations (DB, Google, PySpark data engineering and Infrastructure teams).
Incident Management, Change Management, Release Management and Problem Management.

Sunita Rani
Email ID: [email protected]
Scalable Systems | Inspiring Innovation
Big Data | Analytics | Integration | Intelligence
www.scalable-systems.com |
11:45 AM 22-Feb-25 |