
Koushik R
Sr. Hadoop Cloud Engineer
Manassas, Virginia, USA
Email: [email protected] | Phone: 971 910 2570

Professional Summary:
9+ years of IT experience designing, deploying, and managing large-scale Hadoop ecosystems in both cloud (AWS) and on-premises (Cloudera and Hortonworks) environments for the automotive and banking industries, on Linux and Kubernetes servers.
Hands-on experience installing and configuring services such as Amazon EMR, EKS, EC2, IAM, CloudWatch, AWS Batch, Terraform, S3, CloudFormation, Hadoop MapReduce, HDFS, YARN, Hive, Spark, HBase, ZooKeeper, Kafka, Sqoop, Ranger, and Oozie.
Hands-on experience upgrading clusters and applying patches on the Cloudera and Hortonworks platforms.
Hands-on experience managing multiple clusters in the cloud (Amazon EMR, EKS) and on premises (Cloudera and HDP platforms).
Hands-on experience working with infrastructure teams to manage and upgrade server patching and system fixes.
Hands-on experience integrating Hadoop with third-party tools such as Terraform, IAM, S3, CloudWatch, Grafana, and Unravel.
Hands-on experience monitoring the health of Hadoop clusters and handling capacity planning with tools such as Unravel, Splunk, CloudWatch, Grafana, Icinga, and Simon.
Hands-on experience adding and removing instances on cloud and on-prem platforms.
Strong knowledge of AWS EMR, Hadoop, Spark, Hive, and the MapReduce framework.
Experience creating Kafka topics, granting access, and troubleshooting issues.
Good understanding of Hive, Spark, and MapReduce jobs and of optimizing performance and memory management.
Hands-on experience performing firmware upgrades and OS patching on RHEL servers.
Hands-on experience managing and maintaining Windows Server environments, Active Directory, and group policies.
Hands-on experience copying files between clusters and performing data cleanup using DistCp (see the sketch after this list).
Experience analyzing log files for Hadoop and ecosystem services and finding root causes.
Hands-on experience developing CI/CD pipelines using Ansible and Jenkins to automate data pipeline deployment, testing, and integration with other services.
Experience in scripting languages: Java, shell, and Python.
Experience in cluster capacity planning, performance tuning, cluster monitoring, and troubleshooting Hadoop clusters.
Experience maintaining resilient, scalable, and cost-optimized AWS environments (EC2, IAM, Load Balancers, S3, etc.).
Experience with memory management, queue allocation, and resource distribution in Hadoop/Cloudera environments.
Experience in benchmarking and in backup and disaster recovery of NameNode metadata and sensitive data residing on the cluster.
Experience performing minor and major upgrades and commissioning and decommissioning data nodes on Hadoop clusters.
Experience managing Oozie workflows and job controllers for job automation: shell, Hive, and cron jobs.
Experience with AWS core services such as EC2, S3, IAM, RDS, CloudWatch, CloudFormation, Terraform, and Security Hub.
Hands-on experience with AI tools such as Copilot and Gemini.
Cloudera certified and an AWS Certified Solutions Architect - Professional.
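
A minimal sketch of the kind of DistCp copy referenced above; the NameNode hosts and the warehouse path are hypothetical:

    # Copy a warehouse directory from the active cluster to the DR cluster;
    # -update skips files that already match, -p preserves permissions/ownership
    hadoop distcp -update -p \
        hdfs://active-nn.example.com:8020/user/hive/warehouse \
        hdfs://dr-nn.example.com:8020/user/hive/warehouse
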
Technical Skills:
Big Data Ecosystem: Hortonworks, Cloudera, AWS EMR, EKS, EC2, IAM, Terraform, HDFS, S3, CloudWatch, CloudFormation, AWS Batch, Hive, Hadoop MapReduce, ZooKeeper, HBase, Spark, Sqoop, Oozie, Kafka, NiFi, Ranger, and YARN.
Database: MySQL, PostgreSQL, RDS.
Scripting Languages: Bash, Python, Java, UNIX shell scripting.
Operating Systems: RHEL 6.x, 7.x, 8.x, 9.x, 10.x.
Management: AWS EMR, Cloudera and Hortonworks Platform.
Monitoring Tool: CloudWatch, Unravel, Grafana, Splunk, Icinga, Simon.
Cloud Computing: AWS EMR, Cloudera.
AI Tools: Copilot, Gemini.
Certifications: Cloudera; AWS Certified Solutions Architect - Professional.


Professional Experience:

U.S. Bank, Minneapolis, MN Aug 2023 - Present
Role: Sr. Hadoop Cloud Engineer
Responsibilities:

Hands-on experience with cloud and on-prem platforms such as AWS EMR, Cloudera, and Hortonworks.
Hands-on experience managing and deploying EMR clusters: persistent, transient, and serverless (see the CLI sketch after this list).
Experience with AWS core services such as EC2, S3, IAM, RDS, CloudWatch, CloudFormation, Terraform, and Security Hub.
Hands-on experience across multiple domains, including cloud computing, security, and identity and access management.
Strong knowledge of AWS infrastructure and security services (EC2, IAM, Config, S3, CloudWatch, AWS Batch, Terraform, etc.).
Hands-on experience maintaining resilient, scalable, and cost-optimized AWS environments (VPCs, EC2, IAM, Load Balancers, S3, etc.).
Hands-on experience with Infrastructure as Code (IaC) using Terraform, Ansible, and CloudFormation for repeatable and auditable deployments.
Experience administering Linux systems to deploy Hadoop clusters and monitoring the clusters using Unravel, Grafana, Splunk, and CloudWatch.
Hands-on experience integrating clusters with AWS services such as EMR, CloudWatch, S3, and Secrets Manager to enhance platform observability and performance.
Hands-on experience installing and configuring services such as Amazon EMR, EKS, EC2, IAM, Terraform, Lambda, CloudWatch, AWS Batch, MapReduce, HDFS, YARN, Hive, Spark, HBase, ZooKeeper, Kafka, Sqoop, Ranger, Impala, and Oozie.
Hands-on experience managing and maintaining Windows Server environments, Active Directory, and group policies.
Hands-on experience installing new version upgrades and patches of CDP Private Cloud.
Experience managing user access, permissions, and security within the clusters.
Experience configuring ODBC connections for the Tableau, SAS, Informatica, and data science support teams.
Backed up and copied data from one host to another for disaster recovery using scp, DistCp, Terraform, and Ansible.
Familiar with CI/CD pipelines using Ansible, Git, and Jenkins for cloud operations.
Responsible for cluster maintenance and monitoring, commissioning and decommissioning data nodes, troubleshooting, and managing and reviewing data backups and log files.
Handled day-to-day user requests (change tasks, incidents): solving developer issues, moving code deployments from one environment to another, granting access to new users, providing quick fixes to reduce impact, and documenting solutions to prevent repeat issues.
Experienced with Spark, improving the performance and optimization of existing Hadoop algorithms using Spark, MapReduce, and YARN.
Experience scheduling batch-processing jobs with AWS Batch, Oozie, and AutoSys.
Hands-on experience in performance troubleshooting, resolution, and investigation of Spark, Hive, BigQuery, and Kafka logs.
Hands-on experience with security, including SSL/TLS and Kerberos authentication.
Hands-on experience commissioning and decommissioning nodes on Hadoop clusters.
Architecture design and implementation of deployment, configuration management, backup, and disaster recovery systems and procedures.
Responsible for memory management, queue allocation, and resource distribution in Hadoop/Cloudera environments.
Hands-on experience with AI tools such as Copilot and Gemini.
Cloudera certified and an AWS Certified Solutions Architect - Professional.
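
A hedged sketch of launching a transient EMR cluster from the CLI, as referenced above; the cluster name, bucket, and job script are hypothetical, and the default EMR roles are assumed to exist:

    # Launch a transient EMR cluster that runs one Spark step and terminates
    aws emr create-cluster \
      --name "nightly-etl" \
      --release-label emr-6.15.0 \
      --applications Name=Spark Name=Hive \
      --instance-type m5.xlarge --instance-count 3 \
      --use-default-roles \
      --log-uri s3://example-bucket/emr-logs/ \
      --steps Type=Spark,Name=etl,Args=[s3://example-bucket/jobs/etl.py] \
      --auto-terminate
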
Environment: AWS EMR, EC2, CloudWatch, S3, IAM, CloudFormation, Cloudera, Hortonworks, Spark, HDFS, MapReduce, Hive, HBase, Sqoop, Ranger, Solr, Oracle, MySQL, Kafka, Unix, Linux, Java, Shell, Python scripting.

Ford Motor Company, Dearborn, MI Sep 2018 - Aug 2023
Role: Sr. Cloudera Hadoop Administrator
Responsibilities:

Hands-on experience upgrading on-prem Cloudera and Hortonworks clusters to the latest versions.
Experience as a data platform engineer, including Kubernetes, Python, and Jenkins.
Hands-on experience installing and configuring services such as BigQuery, MapReduce, HDFS, YARN, Hive, Spark, HBase, ZooKeeper, Kafka, Sqoop, Ranger, and Oozie.
Hands-on experience working with infrastructure teams to manage and upgrade server patching and system fixes.
Worked on migrating RHEL servers and services from one host to another.
Hands-on experience managing and maintaining Windows Server environments, Active Directory, and group policies.
Hands-on experience installing, configuring, monitoring, and maintaining Hadoop ecosystem components (HDFS, YARN, Hive, Spark, etc.).
Hands-on experience with both Hortonworks (HDP 2.6.5 to 3.1.5) and CDP (7.1.7 to 7.1.9) distributions across 4 clusters ranging from DEV and QA to PROD and DR.
Backed up data from the active cluster to a backup cluster using DistCp.
Hands-on experience integrating Hadoop with third-party tools like Grafana, Unravel, and JupyterHub.
Responsible for cluster maintenance and monitoring, commissioning and decommissioning data nodes, troubleshooting, and managing and reviewing data backups and log files.
Worked through and supported Red Hat Linux OS-level upgrades to RHEL 7.7 and 8.x and performed firmware upgrades on servers.
Experience troubleshooting and resolving hardware, software, and network issues across diverse environments, including opening HPE support cases.
Experience performing firmware upgrades and OS upgrades.
Experience building RHEL servers and adding hosts to the Hortonworks Ambari and Cloudera Manager platforms.
Hands-on experience in programming and scripting languages such as Java, Scala, and Bash.
Worked with users to resolve post-upgrade issues on Hortonworks 3.1.0 and CDP Private Cloud versions.
Day-to-day responsibilities included solving developer issues, moving code deployments from one environment to another, granting access to new users, providing quick fixes to reduce impact, and documenting solutions to prevent repeat issues.
Experienced with Spark, improving the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and YARN.
Hands-on experience with Hadoop ecosystem services such as Hive, Spark, MapReduce, Ranger, Kafka, Oozie, Sqoop, and Solr.
Created end-to-end Spark applications in Scala to perform data cleansing, validation, transformation, and summarization activities according to requirements.
Experienced in adding/installing new components and removing them through Hortonworks Ambari and Cloudera Manager.
Monitored systems and services through the Ambari dashboard to keep the clusters available for the business.
Performance-tuned Spark, MapReduce, and Hive jobs by changing configuration properties and using broadcast variables (see the spark-submit sketch after this list).
Experienced in setting up Jira projects and volume setups for new projects.
Architecture design and implementation of deployment, configuration management, backup, and disaster recovery systems and procedures.
Changed configurations based on user requirements to improve job performance.
Exported analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
Monitored logging and alerting for Hadoop ecosystems with Icinga, Simon, and Grafana dashboards.
Responsible for memory management, queue allocation, and resource distribution in Hadoop/Cloudera environments.
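
A brief sketch of the submit-time tuning mentioned above: changing Spark configuration properties without touching code. The queue name, resource sizes, and jar path are hypothetical and illustrative rather than recommendations:

    # Tune executor sizing and shuffle parallelism at submit time
    spark-submit \
      --master yarn --deploy-mode cluster \
      --queue etl \
      --num-executors 20 \
      --executor-memory 8g --executor-cores 4 \
      --conf spark.sql.shuffle.partitions=400 \
      --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
      /path/to/job.jar
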
Environment: Cloudera, Kubernetes, Hortonworks Ambari, Spark, HDFS, MapReduce, Hive, HBase, Sqoop, Ranger, Solr, Kafka, Oozie, Oracle, MySQL, RHEL 7/8/9 servers, Unix, Linux, Java, Shell scripting, Ansible, Grafana.

Vanguard, Charlotte, NC Sep 2017 - Aug 2018
Role: Hadoop Administrator
Responsibilities:

Worked as an admin on the Hortonworks (HDP 2.2.4.2) distribution for 4 clusters ranging from POC to PROD.
Worked on installing and configuring HDP and Cloudera services (Hue, Ambari views).
Responsible for cluster maintenance and monitoring, commissioning and decommissioning data nodes (see the decommissioning sketch after this list), troubleshooting, and managing and reviewing data backups and log files.
Day-to-day responsibilities included solving developer issues, moving code deployments from one environment to another, granting access to new users, providing quick fixes to reduce impact, and documenting solutions to prevent repeat issues.
Experienced in adding/installing new components and removing them through Ambari.
Installed and configured Hortonworks Data Platform (HDP) and Apache Ambari.
Installed and configured the Hadoop ecosystem (MapReduce, Pig, Sqoop, Hive, Kafka) both manually and using the Ambari server.
Implemented and configured a quorum-based high-availability Hadoop cluster.
Installed and configured the Hadoop monitoring and administration tools Nagios and Ganglia.
Backed up data from the active cluster to a backup cluster using DistCp.
Periodically reviewed Hadoop-related logs, fixed errors, and prevented errors by analyzing warnings.
Hands-on experience working with Hadoop ecosystem components: Hadoop MapReduce, HDFS, ZooKeeper, Oozie, Hive, Sqoop, Pig, Flume, and Atlas.
Experience configuring ZooKeeper to coordinate the servers in clusters and maintain data consistency.
Experience using Flume to stream data into HDFS from various sources.
Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Atlas, and Sqoop, as well as system-specific jobs.
Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
Worked on analyzing data with Hive and Pig.
Helped set up rack topology in the cluster.
Implemented automatic failover with ZooKeeper and the ZooKeeper Failover Controller.
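
A minimal sketch of the DataNode decommissioning flow referenced above, assuming an exclude file is already wired up through the dfs.hosts.exclude property; the host name and file path are hypothetical:

    # Add the host to the HDFS exclude file, then have the NameNode re-read it
    echo "dn42.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes
    # Wait until the node reports "Decommissioned" before powering it off
    hdfs dfsadmin -report | grep -A 2 "dn42.example.com"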

Environment: Hortonworks Hadoop (HDP 2.2.4.2), Cloudera, Spark, HDFS, MapReduce, Atlas, Pig, Hive, HBase, Flume, Sqoop, Windows 2000/2003, Unix, Linux, Java, Shell scripting, Kafka, Oozie, Oracle 10g, MySQL, Impala, Nagios, Ambari.

HP, Texas Sep 2016 - Aug 2017
Role: Hadoop/Linux Administrator
Responsibilities:

Installation, maintenance, administration, and troubleshooting of Sun Solaris 8/9 and Red Hat 9 / AS 3.0 servers on various hardware platforms, including Sun 4800, V480, 280R, 4500, and 3500 and Dell 6400, 2400, and 1800.
Installed and configured Hortonworks Data Platform (HDP) and Apache Ambari.
Installed and configured the Hadoop ecosystem (MapReduce, Pig, Sqoop, Hive, Kafka) both manually and using the Ambari server.
Performed automated operating system installations using Jumpstart for Solaris and Kickstart for Linux.
Worked on installing and configuring HDP and Cloudera services (Hue, Ambari views).
Responsible for cluster maintenance and monitoring, commissioning and decommissioning data nodes, troubleshooting, and managing and reviewing data backups and log files.
Extensively worked on hard disk mirroring and striping with parity using RAID controllers.
Implemented a high-availability cluster using two V480s, a T3, and Veritas Cluster Server.
Worked on setting up Kerberos and used it to grant access to users.
Installation, management, and configuration of LAN/WAN systems using Cisco switches and routers.
Configured various services, devices, and applications on UNIX servers and worked with the application team to customize the environment. Worked with Apache and developed several UNIX scripts to automate web tasks.
Configured a firewall based on Red Hat Linux and FreeBSD 4.x with three network interfaces.
Managed existing documentation for systems and created new procedures to support new products. Created documentation for the disaster recovery project.
Managed servers on VMware and provided test environments on virtual machines.
Provided IT support to internal staff members.
Used Puppet to create modules.
Provided application support to large user groups.
Installed hardware, installed the RHEL 3.0 OS, and configured the required network on a 1,000-node HPC cluster.
Managed the HPC cluster and performed hardware, BIOS, and application upgrades.
Configured and managed the Apache web server.
Managed software and hardware RAID systems.
Configured and maintained FTP, DNS, NFS, and DHCP servers.
Managed user accounts and the authentication process via the NIS service.
Managed the system firewall using ipchains and iptables. Implemented SSH and SSL.
Managed user disk usage by setting up quotas (see the sketch after this list).
Updated software packages and applied security patches.
Performed hardware maintenance, upgrades, and troubleshooting on workstations and servers.
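
A short sketch of the per-user disk quota setup mentioned above; the filesystem, user name, and limits are hypothetical, and usrquota is assumed to already be enabled for /home in /etc/fstab:

    # Build the quota files, turn quotas on, then set block limits for one user
    mount -o remount /home
    quotacheck -cum /home
    quotaon /home
    # 5 GB soft / 6 GB hard block limits, no inode limits (blocks are 1 KB)
    setquota -u alice 5000000 6000000 0 0 /home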

Environment: Solaris 8/9, Red Hat Linux AS 3.0, Veritas Volume Manager 3.x/4.0, Veritas Cluster Server 4.1, Cisco routers, Sun 4800, V480, 280R, 4500, 3500, Dell 6400, 2400, 1800, Red Hat 8/9.

Education:

Bachelor's in Information Technology, GITAM University, India (2009-2013)
Master of Science in Information Systems, Stratford University, Washington DC, USA (2014-2016)
Certification:
AWS Certified Solutions Architect - Professional (Credly)