Josphat Kiaritha - Data Architect
[email protected] | https://www.linkedin.com/in/josphat-kiaritha-6080351a2/
443-512-4340
Location: Columbus, Ohio, USA
Relocation: No
Visa: US Citizen

PROFESSIONAL SUMMARY
Goal-oriented IT professional with a strong background in Data Analytics, Architecture & Modeling.
Strong knowledge of OLTP, ODS, and OLAP architectures, including the Inmon and Kimball approaches as well as Data Vault and hybrid approaches.
Proficient in requirements gathering, source-system analysis, and architectural approach design.
Acted as the Subject Matter Expert (SME) for claims applications maintained and supported by the Data Warehouse.
Designed and implemented an Azure Fabric data warehouse utilizing Denodo data virtualization, enabling seamless data democratization within the organization.
Proficient in data modeling across conceptual, logical, and physical data models using advanced modeling principles.
Proficient with Relational, Analytical, UML and XML Modeling using various Data Modeling tools.
Worked extensively with IBM InfoSphere Data Architect, Erwin, and ER Studio on several projects spanning both OLAP and OLTP applications.
Utilized NoSQL databases to create scalable solutions that efficiently store and manage Big Data applications.
Utilized Azure Fabric for rebuilding crucial metrics and reports with minimal business impact.
Created Data Lakes for various Big Data projects running in Hadoop and AWS.
Created metadata repositories in GitHub, along with data dictionaries and glossaries, for use when creating data models.
Worked with ETL tools to migrate data from various OLTP databases to the ODS and EDW.
Created workflows with DataStage and Informatica Developer tools.
Developed high-performance NoSQL database systems to handle large volumes of unstructured and semi-structured data.
Managed performance and tuning of SQL queries and fixed slow-running queries.
Proficient in creating Views, Materialized views & Partitions for the Oracle warehouse.
Experienced in RDBMS systems such as Netezza, Oracle, DB2, Teradata, MS SQL Server, SQLite.
Experienced in analysis of complex data sets and applications and resolving defects that are critical to smooth running of Business functions.
Experienced in designing and developing CRM applications such as Salesforce objects, workflows and reports.
Strong analytical ability with the capability to determine the root cause of problems and provide solutions to the issues.
Ability to juggle multiple tasks with competing and frequently changing time-sensitive deadlines and priorities.
Excellent communication skills; works well with end users and management teams.
Strong project management skills: Plan, organize, monitor, and control projects, ensuring efficient utilization of resources, to achieve project objectives and deadlines.
Over 5 years of experience in Data Virtualization with Denodo.
Performed Data Governance, standardization of data, normalization, Master Data Management (MDM) and other transformations with Data Stage and Informatica Developer tools.
Optimized read and write operations in NoSQL databases through the use of indexing and caching strategies.
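As a rough illustration of the indexing and caching point above, the minimal pymongo sketch below creates a compound index and adds an in-process read cache; the connection string, collection, and field names are placeholders rather than details from any specific engagement.

```python
# Minimal sketch: compound index plus an in-process read cache for a
# read-heavy MongoDB collection. All names here are placeholders.
from functools import lru_cache
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")   # placeholder connection string
claims = client["warehouse"]["claims"]

# Compound index supporting the most common lookup pattern
claims.create_index([("claim_id", ASCENDING), ("loss_date", ASCENDING)])

@lru_cache(maxsize=10_000)   # cache hot, read-mostly lookups in process memory
def get_claim(claim_id: str):
    """Fetch a single claim document, excluding the internal _id field."""
    return claims.find_one({"claim_id": claim_id}, {"_id": 0})
```
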
EDUCATION AND TRAINING
University of Maryland (UMBC)
Bachelor of Science in Chemical Engineering
2003 – 2008

CERTIFICATES
Informatica Data Quality for Developers
Informatica Data Quality for Analysts
AWS Practitioner
Power BI Essentials

EMPLOYMENT AND EXPERIENCE
ForceV - Estes Express Lines, Richmond, VA June 2023 – Present
Senior Data Modeler Architect
Responsibilities:
Gathered business requirements from multiple areas within the organization to build an enterprise data model which would serve as a single source of truth for all data needs.
Translated business requirements into technical requirements by creating technical design documents that detailed all tasks required at each step of the data life cycle.
Secured NoSQL databases through robust access controls, encryption, and security best practices.
Integrated data from multiple sources to build a data fabric Datawarehouse utilizing Denodo data virtualization to aid in data democratization within the organization.
Created conceptual, logical, and physical data models, both from scratch and by reverse engineering existing databases.
Designed and implemented data warehousing solutions on Azure using services like Azure Synapse Analytics, enabling efficient data storage and analysis.
Implemented scalable solutions using Azure Fabric, ensuring that the architecture could handle growing data demands efficiently.
Involved in data analysis, data wrangling, data mining and solutioning for the various use cases within the organization to support the business and technical teams.
Developed and deployed microservices-based applications using Azure Fabric, enhancing agility and scalability.
Leveraged the flexibility of NoSQL databases to support rapid application development cycles.
Created source-to-target mappings used as a reference during development for database and table design and data integration (see the sketch after this list).
Collaborated with other Data Architects and Data Governance teams to come up with data standardization rules.
Optimized read and write performance in MongoDB through effective indexing and query optimization.
Supported development teams during their ETL processing of data by analyzing and finding solutions to issues that were encountered.
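As a rough illustration of the source-to-target mapping artifacts referenced above, a mapping row can be captured as structured metadata that developers and tooling can reference; the table, column, and transformation names below are hypothetical placeholders, not actual mappings from this engagement.

```python
# Minimal sketch: a source-to-target mapping captured as structured metadata.
# All table, column, and rule names are illustrative placeholders.
from dataclasses import dataclass

@dataclass(frozen=True)
class SourceToTargetMapping:
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    transformation: str   # plain-English or pseudo-SQL rule

MAPPINGS = [
    SourceToTargetMapping("STG_SHIPMENT", "SHP_ID", "DIM_SHIPMENT", "SHIPMENT_KEY",
                          "surrogate key assigned during load"),
    SourceToTargetMapping("STG_SHIPMENT", "SHP_DT", "DIM_SHIPMENT", "SHIPMENT_DATE",
                          "CAST(SHP_DT AS DATE)"),
]

for m in MAPPINGS:
    print(f"{m.source_table}.{m.source_column} -> "
          f"{m.target_table}.{m.target_column}: {m.transformation}")
```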

Environments: DB2, SQL Server, IBM InfoSphere Data Architect, Miro, AWS, Azure, Denodo, SSIS

DXC - One America - Contractor, Indianapolis, IN April 2021 – June 2023
Senior Area Data Architect (ADA)
Responsibilities:
Utilized Azure Fabric for container orchestration, enabling seamless deployment and management of containerized applications.
Integrated data from multiple sources using Azure services like Azure Data Factory, ensuring smooth data flow and consistency.
Analyzed business requirements and created data and system flow diagrams as a visual aid for developers and other stakeholders, enabling them to quickly understand the project architecture and the processes to be implemented.
Ensured security in MongoDB by implementing robust authentication mechanisms and encryption protocols.
Created Conceptual and Logical Data Models with ER Studio to aid in the Business discussions at the onset of various projects.
Collaborated with other Data Architects and Data Governance teams to define and enforce data standardization rules using Azure Fabric.
Created highly normalized Data Models for front end systems such as REPS to support front end OLTP systems.
Integrated MongoDB with various ETL processes to streamline data extraction, transformation, and loading.
Interpreted Business Requirements to technical requirements by creating the Architectural Approach and giving the complete implementation steps by creating Technical Requirements Documents.
Utilized Azure services like Azure Databricks and HDInsight for big data processing, enabling advanced analytics and insights.
Supported development teams during their ETL processing of data by analyzing and finding solutions to issues that were encountered.
Developed real-time data processing solutions using MongoDB to support fast decision-making applications.
Created dimensional models using Kimball's approach, with fact tables and the different types of dimension tables such as slowly changing and conformed dimensions.
Used the AWS Glue Data Catalog to prepare and transform data for analytics, machine learning, and other data processing tasks.
Created ETL workflows in AWS Glue that automated the extraction, transformation, and loading of data (see the sketch after this list).
Collaborated with different stakeholders within the data life cycle, including source teams, business teams, and developers.
Provided support for the various stakeholders whenever the need arose.
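The AWS Glue workflows mentioned above typically take the shape of a short PySpark job script; the sketch below shows the general pattern, with the catalog database, table, column mappings, and S3 path invented for illustration (it runs only inside a Glue job environment).

```python
# Minimal sketch of an AWS Glue ETL job: read from the Glue Data Catalog,
# rename/cast columns, and write Parquet to S3. Names and paths are placeholders.
import sys
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

raw = glue_context.create_dynamic_frame.from_catalog(
    database="claims_db", table_name="raw_claims")          # catalog source
mapped = ApplyMapping.apply(frame=raw, mappings=[
    ("clm_id", "string", "claim_id", "string"),
    ("loss_dt", "string", "loss_date", "date"),
])
glue_context.write_dynamic_frame.from_options(
    frame=mapped, connection_type="s3",
    connection_options={"path": "s3://example-bucket/curated/claims/"},
    format="parquet")
job.commit()
```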

Environments: DB2, SQL Server, ER Studio Data Architect, Miro, AWS, Azure, Snowflake

IBM - USAA Insurance, San Antonio, TX Jan 2020 – March 2021
Senior Data Modeler Architect
Responsibilities:
Profiled source systems using SQL queries and IBM Information Analyzer to determine data anomalies, data patterns and statistics that helped in coming up with the most appropriate Business Rules to be applied during the ETL process.
Configured MongoDB to support high-throughput operations by tuning write concern and read preferences.
Collaborated with architects to review candidate architecture for projects in Enterprise Architecture.
Leveraged Azure Fabric for seamless integration with other Azure services, optimizing data workflows and processes.
Implemented real-time analytics solutions on Azure using services like Azure Stream Analytics, enabling instant insights and decision-making based on live data streams.
Assessed system impacts due to data migration from legacy databases to enterprise data warehouse.
Implemented Azure Fabric to create conceptual, logical, and physical data models from scratch, ensuring comprehensive data representation.
Created highly normalized OLTP databases for our front-end systems.
Involved in standardization of OLTP front end systems to integrate data from the disparate sources to create the operational data store for near real time reporting.
Involved in architecting data migration and delivery solutions to AWS and Snowflake.
Utilized Azure Functions and Logic Apps for serverless computing, enabling efficient and cost-effective application development and deployment.
Transformed XML schemas from XML databases into logical data models.
Utilized MongoDB's aggregation framework to perform complex data transformations and analysis.
Interpreted XML schema elements through XML-to-logical-data-model transformations.
Developed and implemented data matching processes for attributes to be loaded into Informatica MDM.
Identified, validated and leveraged source and target database schemas, ensuring conformity and reusability.
Involved in data lineage and Informatica ETL mapping development, complying with data quality and governance standards.
Collaborated with development teams to design and implement MongoDB solutions that meet business requirements.
Created and enhanced future-friendly logical data models and conducted design walkthroughs with internal teams and end users.
Involved with the modeling of the Information Model that acted as a bridge between Business and IT teams.
Generated physical data models and DDL scripts for database objects, incorporating enterprise standards and industry best practices for the target database.
Created and administered database objects such as tables, sequences, history triggers, and stored procedures in sandbox environments for preliminary development and testing.
Supported database implementations, performance tuning such as query execution plan, data distribution and partitioning, issue resolution and clean-up efforts.
Coordinated with offshore ETL development and testing and reporting teams.
Managed large-scale distributed systems using NoSQL databases, optimizing for performance and availability.
Validated that semantics of data elements being reported align with business requirements.
Captured, validated and published metadata in accordance with enterprise data governance policies and MDM taxonomies.
Identified confidential and PII data elements and involved in enforcing appropriate protective measures such as tokenizing, masking, redacting etc. for data in flight and at rest.
Automated the setup and configuration of needed AWS resources using CloudFormation, which helped in their management throughout their lifecycle.
Developed a Data Vault 2.0 proof of concept to compare Inmon's 3NF data warehouse approach with a Data Vault 2.0 data warehouse.
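As background for the Data Vault 2.0 proof of concept above, a hub-and-satellite pair typically looks like the minimal sketch below; the claim-oriented table and column names are illustrative only, and SQLite stands in for the actual warehouse platform so the example runs anywhere. Hash keys let hubs, links, and satellites load in parallel without sequence lookups, which is a core Data Vault 2.0 design choice.

```python
# Minimal Data Vault 2.0 sketch: one hub and one satellite keyed by an MD5 hash
# of the business key. Table and column names are illustrative placeholders.
import hashlib
import sqlite3

DDL = [
    """CREATE TABLE hub_claim (
           claim_hk      CHAR(32)    PRIMARY KEY,   -- hash key of the business key
           claim_number  VARCHAR(30) NOT NULL,      -- business key
           load_dts      TIMESTAMP   NOT NULL,
           record_source VARCHAR(50) NOT NULL)""",
    """CREATE TABLE sat_claim_detail (
           claim_hk      CHAR(32)    NOT NULL REFERENCES hub_claim(claim_hk),
           load_dts      TIMESTAMP   NOT NULL,
           hash_diff     CHAR(32)    NOT NULL,      -- detects attribute changes
           claim_status  VARCHAR(20),
           loss_amount   NUMERIC(12, 2),
           record_source VARCHAR(50) NOT NULL,
           PRIMARY KEY (claim_hk, load_dts))""",
]

def hash_key(*parts: str) -> str:
    """Deterministic MD5 hash key over the normalized business key parts."""
    return hashlib.md5("||".join(p.strip().upper() for p in parts).encode()).hexdigest()

conn = sqlite3.connect(":memory:")
for stmt in DDL:
    conn.execute(stmt)
conn.execute("INSERT INTO hub_claim VALUES (?, ?, CURRENT_TIMESTAMP, ?)",
             (hash_key("CLM-1001"), "CLM-1001", "SOURCE_SYSTEM_A"))
```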
Environments: IBM InfoSphere Data Architect, Snowflake, AWS Redshift Spectrum, AWS S3, AWS Glue, AWS CloudFormation, IBM Information Analyzer, Hadoop Hive, Teradata, Netezza, SQL Server, IBM Data Stage, Tableau, Power BI, Altova XMLSpy, Data Vault 2.0

Nationwide Insurance, Columbus, OH Nov 2017 – Jan 2020
Senior Data Architect/Data Modeler
Responsibilities:
Profiled source data using Informatica Data Quality tool and custom queries to discover the various data patterns, data anomalies and to understand the business by relating the Business Requirements with the source data.
Documented all observations from Profiling Data highlighting all the data issues discovered, documenting the queries that led to the discovery of the issues and giving recommendations to the Subject Matter Experts, Developers and all other stakeholders.
Performed thorough source-data analysis, including mainframe systems and OLTP applications.
Created Source to Target mappings on Informatica Developer and on IBM Infosphere Data Architect with detailed joining criteria and transformations.
Created BTEQ and MultiLoad (MLoad) scripts to load data into Teradata applications.
Involved in creating Big Data applications running on Hadoop, such as Smart Ride and Smart Miles.
Wrote complex data profiling SQL queries to profile data in Hadoop HDFS due to the lack of connectivity between Hive and the Informatica Data Quality tool.
Used Hive to execute queries and create external tables for reading files loaded into HDFS.
Acted as the Claims Subject Matter Expert at the Data Warehouse, supporting 6 different claims applications and performing in-depth data analysis in Teradata to identify trends, produce forecasts, and generate canned and ad hoc reports for our business partners.
Developed complex balancing reports with MicroStrategy and Tableau for the claims applications to ensure all claims from our source systems running in Teradata and Oracle were correctly loaded into all downstream applications.
Analyzed daily and monthly error reports for our claims, billing, and finance applications, most of which run in Teradata, Oracle, and Hadoop, to make sure all systems were functioning as expected, and worked to resolve any errors that arose. This called for investigation through rigorous data analysis and consultation with various stakeholders.
Involved with the migration of the Claims Enterprise Data Warehouse running in Teradata to Snowflake and AWS by creating a new hybrid Unified Claims Analytical System (UCAS).
Created and maintained the conceptual, logical, and physical data models for all projects I was involved in, maintaining the models on GitHub and updating them whenever changes were requested.
Created an enterprise data warehouse from scratch for the P&C line of business that stored historical data, and created data marts on top of the data model for the various lines of business. Created views to expose the data for reporting purposes and built the reports in MicroStrategy and Tableau.
Led the creation of a PII database for compliance with new business regulations to handle sensitive customer information, which involved manually writing very complex SQL queries to integrate PII data from over 60 sources and enable easy customer searches on the front-end user interface.
Led an effort to migrate data from a finance mainframe system to AWS within 6 months, enabling the organization to save more than $1,000,000 per month in licensing fees.
Created data lakes for claims in AWS S3 that were used to create various data marts for business reporting, then converted the data to Parquet and registered it in the Glue Data Catalog (see the sketch after this list).
Created various data marts from the Data Lake to support various business units within the enterprise.
Created AWS data pipelines to perform various activities with data stored in various AWS tools.
Proficient in using IBM InfoSphere Data Architect and the Erwin modeling tool, including reverse engineering, forward engineering, and Complete Compare functions to create, update, and resolve issues with existing data models.
Created and Implemented Naming Standards and Domains on IBM Infosphere Data Architect to enable the entire Enterprise Data Management Team to have set Enterprise Standards.
Created and documented the models, data workflow diagrams, sequence diagrams, activity diagrams, and field mappings of all existing systems.
Generated ad-hoc SQL queries using joins, database connections and transformation rules to fetch data from the source and SQL Server database systems.
Participated in migration of data sources to Salesforce apps from legacy CRMs, designing and developing Salesforce objects, workflows and reports.
Developed Data Model by Data Virtualization techniques using Denodo 5.5 / 6.0 by connecting to multiple data sources such as SQL Server, Oracle, Hadoop etc.
Involved in various Data Governance processes such as Master Data Management and Standardization of data using Informatica Developer.
Consolidated data for which the ETL tool was unable to create golden records by setting up notifications through the Human Task transformation in the MDM tool.
Involved in data governance tasks such as data profiling at an enterprise level to make sure data was complete, consistent and accurate before ingestion into the DW Data Lake.
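The Parquet conversion and Glue Data Catalog registration described above can be approximated with the aws-sdk-pandas (awswrangler) library; the bucket, database, and table names below are hypothetical, and the library choice itself is an assumption rather than the exact tooling used on this project.

```python
# Minimal sketch: land a claims extract in S3 as partitioned Parquet and
# register it in the Glue Data Catalog. All names and paths are placeholders.
import awswrangler as wr
import pandas as pd

claims = pd.DataFrame({
    "claim_id": ["CLM-1001", "CLM-1002"],
    "loss_date": pd.to_datetime(["2019-01-15", "2019-02-03"]),
    "loss_amount": [1250.00, 980.50],
    "state": ["OH", "TX"],
})

wr.s3.to_parquet(
    df=claims,
    path="s3://example-claims-lake/curated/claims/",
    dataset=True,                  # write as a partitioned dataset
    partition_cols=["state"],
    database="claims_lake",        # Glue Data Catalog database
    table="claims",
    mode="overwrite_partitions",
)
```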

Environments: IBM Infosphere Data Architect, Erwin 9.64, AWS Redshift Spectrum, AWS S3, AWS Glue, AWS RDS, AWS Data Pipeline, Informatica, Hadoop Hive, Teradata, Netezza, SQL Server, Informatica Data Quality (IDQ) Client, IBM Data Stage, Tableau, Snowflake, Power BI, Microstrategy, Cognos.

Tek Systems - Nationwide Insurance, Columbus, OH Sept 2018 – Aug 2019
Requirements Lead
Responsibilities:
Conducted JAD sessions and gathered information from Business Analysts, Developers, end users, and stakeholders to determine the requirements of various systems.
Analyzed business requirements and created data and system flow diagrams as a visual aid for developers and other stakeholders, enabling them to quickly understand the project architecture and the processes to be implemented.
Created Conceptual and Logical Data Models to aid in the Business discussions at the onset of various projects.
Analyzed source systems such as OLTP databases and the Operational Data Store (ODS) before pulling the data into our staging databases in the Data Warehouse.
Interpreted Business Requirements to technical requirements by creating the Architectural Approach and giving the complete implementation steps by creating Technical Requirements Documents.
Generated canned and ad hoc reports as per business need with Tableau and MicroStrategy.
Involved with iteration management, determining project priorities, assigning work to team members, and managing the projects on RTC and Jira tools.
Created Epic Cards and Story Cards in Jira to track the in-flight projects and enable efficient resources management.
Created base views, derived views, joins, unions, projections, selections, minus and flatten views, interfaces, and associations of data service layers in Denodo.

Environments: Salesforce Platform, XMLSpy Works, MS Visio, Microsoft Visual Studio, Informatica, SSIS, Rational Team Concert (RTC), Jira, Denodo


Johns Hopkins Hospital, Baltimore, MD Oct 2015 – Nov 2017
Senior Data Architect/Modeler
Responsibilities:
Participated in JAD sessions, gathered information from Business Analysts, end users and other stakeholders to determine the requirements.
Translated the business requirements into workable functional and non-functional requirements at a detailed production level using workflow diagrams, sequence diagrams, activity diagrams, and use case modeling.
Used HL7 to improve care delivery, optimize workflow, reduce ambiguity, and enhance knowledge transfer among all stakeholders (see the sketch after this list).
Involved in translating business requirements into data requirements.
Used data vault modeling method which was adaptable to the needs of this project.
Created and maintained Logical Data Model (LDM) for the project which included documentation of all entities, attributes, data relationships, referential integrity, domains, codes, business rules, glossary terms, etc.
Created business requirement documents and integrated the requirements and underlying platform functionality.
Maintained data model and synchronized it with the changes to the database.
Involved in the modeling and development of Reporting Data Warehousing System.
Created a data lake in Hadoop as a single source of truth, consolidating data from various sources and in various data formats.
Used ETL tools to extract, transform and load data into data warehouses from various sources like relational databases, application systems.
Developed stored procedures, triggers, packages, functions, and exceptions using PL/SQL.
Involved with all the phases of Software Development Life Cycle (SDLC) methodologies throughout the project life cycle.
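As a rough illustration of working with HL7 v2.x messages such as those referenced above, the sketch below parses a tiny ADT message with the python-hl7 library; the sample message content and the library choice are assumptions made purely for demonstration.

```python
# Minimal sketch: parse an HL7 v2.x ADT message and pull a couple of PID fields.
# The message content is fabricated for illustration.
import hl7

raw = "\r".join([
    "MSH|^~\\&|SENDING_APP|SENDING_FAC|RECEIVING_APP|RECEIVING_FAC|202301011200||ADT^A01|MSG00001|P|2.5",
    "PID|1||123456^^^HOSP^MR||DOE^JANE||19800101|F",
])

msg = hl7.parse(raw)
pid = msg.segment("PID")          # first PID segment
print("Patient ID field:", str(pid[3]))   # 123456^^^HOSP^MR
print("Patient name:", str(pid[5]))       # DOE^JANE
```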
Environment: HL7 Version 2.5, Erwin, IBM DB2, Netezza 7, Teradata, Hadoop, MS Visio, Oracle 11g, IBM InfoSphere FastTrack Client, IBM Data Stage, Toad, Power BI

State of NJ, Trenton, NJ May 2013 – Aug 2015
Data Modeler
Responsibilities:
Facilitated logical JAD sessions to discuss and gather business requirements for all assigned projects. This process included scheduling meetings with the business and application groups, documenting agendas and minutes.
Participated in data analysis and data dictionary and metadata management - Collaborating with business analysts, ETL developers, data quality analysts and database administrators.
Designed order provisioning business processes, target application architecture, and the target data model, and developed relational and dimensional models.
Reverse engineered the reports and identified the Data Elements (in the source systems), Dimensions, Facts and Measures required for reports.
Validated and maintained the enterprise-wide logical data model for Data Staging Area.
Worked at conceptual/logical/physical data model level using Erwin according to requirements.
Aided and verified DDL implementation by DBAs, corresponded and communicated data and system specifications to DBA, development teams, managers and users.
Managed the meta-data for the Subject Area models for both Operational & Data Warehouse/Data Mart applications.
Conducted data profiling and quality control/auditing to ensure accurate and appropriate use of data.
Used DataStage to extract data from relational databases and various file formats, cleansing and transforming the data before loading it into the target database.

Environment: ERWIN, ER Studio, Oracle 10g, DB2 10.1, SAP Business Objects Data Services 3.X, Tableau, Teradata, Hadoop, Informatica, SQL Server

Independence Blue Cross, Philadelphia, PA Aug 2011 – Apr 2013
Sr. Data Modeler
Responsibilities:
Conducted JAD sessions and gathered information from Business Analysts, SQL Developers, end users, and stakeholders to determine the requirements of various systems.
Created and documented the models, data workflow diagrams, sequence diagrams, activity diagrams, and field mappings of all existing systems.
Proficient in using Erwin reverse engineering, forward engineering, and Complete Compare functions to create, update, and resolve issues with existing data models.
Created and implemented Erwin naming standards and domains in the design and creation of logical and physical data models in accordance with the company's standards.
Worked with other data modelers, DBAs, and developers on my projects, using Microsoft Team Foundation Server to ensure that every aspect of the project was effectively considered and completed in a timely manner.
Prepared and presented graphical representations of entity relationship diagrams to elicit more information.
Created and maintained conceptual, logical, and physical data models for the Data Vault data warehouse.
Defined the information required for management and business intelligence purposes using Erwin Data Modeler.
Worked with the DBAs and SQL Developers to deploy and maintain the developed physical data models.
Used Erwin Data Modeler to design and implement the company's health care domains.
Generated ad-hoc SQL queries using joins, database connections and transformation rules to fetch data from the source and SQL Server database systems.
Environment: Erwin 9.64, MS SQL Server 2012/2014, ServiceNow, Salesforce, MS Visio, Microsoft Visual Studio 2012/2013, Microsoft Team Foundation Server, Data Vault


Sanofi, Bridgewater, NJ Oct 2009 – July 2011
Sr. Data Analyst/Modeler
Responsibilities:
Attended and participated in information and requirements gathering sessions.
Translated business requirements into working logical and physical data models for Staging, Operational Data Store and Data marts applications.
Performed gap analysis and dependency analysis for current & future systems.
Designed Star Schema Data Models for Enterprise Data Warehouse using Power Designer.
Created Mapping documents for Staging, ODS & Data Mart Layers.
Created and maintained logical and physical data models for the project, including documentation of all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules, glossary terms, etc.
Validated and updated the appropriate models, process mappings, screen designs, use cases, business object model, and system object model as they evolved and changed.
Created Model reports including Data Dictionary, Business reports.
Generated SQL scripts and implemented the relevant databases with related properties from keys, constraints, indexes & sequences.
Performed performance tuning.

Environment: Power Designer 11, Oracle 11g, Informatica, Toad

Fiserv Inc., Cherry Hill, NJ May 2008 – Aug 2009
Data Analyst/Data Modeler
Responsibilities:
Reviewed functional requirements and use cases to determine the necessary data requirements.
Participated in creating realistic project plans with detailed tasks and ensuring their timely execution.
Reverse engineered the current system data model, subdivided it into work streams for analysis, and generated data source-to-target mapping documents.
Created process flows and data flow diagrams of the current and future systems.
Performed dependency and gap analysis for data structure changes on other downstream applications and impact of new data elements on existing downstream processes.
Designed and produced logical and physical data models for the financial platform and other in-house applications running on Oracle databases.
Developed ETL procedures for moving data from source to target systems.
Worked closely with both functional and technical team on the creation of data models which can seamlessly integrate with existing data structures on multiple Oracle databases integrated by database links.
Generated surrogate keys for the dimensions referenced in the fact table for indexed, faster access to data (see the sketch after this list).
Maintained database standards documents, for example a naming standards document and a data dictionary, to ensure consistency.
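The surrogate-key pattern mentioned above looks roughly like the dimensional sketch below; the table and column names are illustrative, and SQLite is used only so the example runs anywhere. Joining facts on a narrow integer surrogate key (rather than the natural business key) keeps fact-table rows small and index lookups fast.

```python
# Minimal sketch: a dimension with an auto-assigned surrogate key and a fact
# table that joins on it instead of the natural business key. Names are placeholders.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_account (
        account_key    INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        account_number VARCHAR(20) NOT NULL,               -- natural/business key
        account_type   VARCHAR(30)
    );
    CREATE TABLE fact_transaction (
        account_key INTEGER NOT NULL REFERENCES dim_account(account_key),
        txn_date    DATE    NOT NULL,
        amount      NUMERIC(12, 2) NOT NULL
    );
    CREATE INDEX ix_fact_account ON fact_transaction(account_key);
""")
conn.execute("INSERT INTO dim_account (account_number, account_type) VALUES (?, ?)",
             ("ACCT-0001", "CHECKING"))
key = conn.execute("SELECT account_key FROM dim_account WHERE account_number = ?",
                   ("ACCT-0001",)).fetchone()[0]
conn.execute("INSERT INTO fact_transaction VALUES (?, DATE('2009-01-15'), 250.00)", (key,))
```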
Environment: Erwin 4/7, MS Visio, Toad, Oracle 10g, PL/SQL, SQL*Plus, Data Stage