April 2024 - present

Walmart Global Tech

Software Engineer III

November 2023 - April 2024

Intuit (on Altimetrik payroll)

Senior Engineer

November 2021 - September 2022

Societe Generale

Specialist Software Engineer

June 2020 - November 2021

Mindtree

Senior Engineer

January 2019 - June 2020

Mindtree

Engineer

October 2018 - December 2018

Mindtree

Trainee Software Engineer




Walmart Global Tech

Software Engineer III

April 2024 - present

Technologies:

Apache Spark, GCP Cloud Storage, PostgreSQL, MySQL, Apache Airflow, Jenkins, Bash, Python, WCNP (Walmart Cloud Native Platform)

Skills:

Data pipeline design, financial data integration and reconciliation, performance optimization and testing, infrastructure management, technical leadership

Achievements:

☆ Led an enterprise-wide migration of Apache Spark applications from version 2.x to 3.x, improving performance by 25% and reducing cloud computing costs with no loss of functionality.
☆ Automated a financial reconciliation system integrating multiple data sources to validate $1.5B+ in daily transactions, cutting reconciliation time from 7 days to 2 and eliminating manual verification effort.
☆ Architected and implemented a data archival solution between GCP Cloud Storage and PostgreSQL, automating archival workflows to meet data retention policies and reducing database storage costs by 30%.
☆ Executed complex database migrations, transferring 25 GB of data from MySQL to PostgreSQL with zero data loss and minimal downtime.
☆ Improved operational efficiency with Bash and Python scripts that automate repetitive tasks, streamline job scheduling, and manage clusters on GCP and WCNP (Walmart Cloud Native Platform).

Responsibilities:

  • Data Pipeline Development and Maintenance
    • Design, develop, and maintain scalable data pipelines for financial data processing
    • Implement business logic for data transformation and enrichment
    • Ensure data quality and accuracy through validation checks
    • Monitor pipeline performance and SLAs
    • Troubleshoot and resolve data processing issues in production environments
  • Financial Data Integration and Reconciliation
    • Coordinate with multiple teams to understand source data formats and business requirements
    • Implement data integration patterns for various vendor sources
    • Develop automated reconciliation processes for financial transactions
    • Maintain data lineage and documentation
    • Ensure compliance with financial reporting standards
  • Infrastructure and Platform Management
    • Manage and optimize GCP resources for cost and performance
    • Implement and maintain database systems (PostgreSQL, MySQL)
    • Configure and optimize Apache Spark clusters
    • Set up and maintain workflow orchestration tools (Airflow, Jenkins)
    • Implement security best practices and vulnerability fixes
  • Performance Optimization and Testing
    • Profile and optimize Spark jobs for better performance
    • Develop and maintain automated testing frameworks
    • Conduct performance testing and benchmarking
    • Implement monitoring and alerting systems
    • Create and maintain test data sets and testing environments
  • Technical Leadership and Collaboration
    • Provide technical guidance and mentorship to team members
    • Collaborate with business stakeholders to understand requirements
    • Work with SAP team for financial reporting integration
    • Participate in architecture discussions and technical design reviews
    • Document technical solutions and maintain knowledge base
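
The automated reconciliation responsibility above can be sketched minimally as matching transactions across two sources by ID. The field names (`txn_id`, `amount`), sample data, and tolerance below are illustrative assumptions, not the production schema.

```python
# Minimal reconciliation sketch: match transactions from two sources by ID
# and flag amount mismatches beyond a small tolerance. Field names and the
# tolerance are illustrative assumptions, not the production schema.

def reconcile(source_a, source_b, tolerance=0.01):
    """Return (matched, mismatched, missing) transaction IDs."""
    a = {t["txn_id"]: t["amount"] for t in source_a}
    b = {t["txn_id"]: t["amount"] for t in source_b}
    matched, mismatched = [], []
    for txn_id, amount in a.items():
        if txn_id not in b:
            continue
        if abs(amount - b[txn_id]) <= tolerance:
            matched.append(txn_id)
        else:
            mismatched.append(txn_id)
    missing = sorted(set(a) ^ set(b))  # present in only one source
    return sorted(matched), sorted(mismatched), missing

ledger = [{"txn_id": "T1", "amount": 100.00}, {"txn_id": "T2", "amount": 50.00}]
vendor = [{"txn_id": "T1", "amount": 100.00}, {"txn_id": "T2", "amount": 49.50},
          {"txn_id": "T3", "amount": 75.00}]
matched, mismatched, missing = reconcile(ledger, vendor)
```

In practice each side would be a Spark DataFrame joined on the transaction key; the pure-Python version just shows the matching logic.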

Intuit (on Altimetrik payroll)

Senior Engineer

November 2023 - April 2024

Technologies:

Apache Spark, AWS (S3, EMR), Hive, Informatica, QuickBase, OpenAI GPT-3.5 Turbo, Plotly Dash, Python

Skills:

Workflow migration, data pipeline orchestration, NLP-based log analysis, like-to-like testing and validation

Achievements:

☆ Received the “Team” Award from Altimetrik for presenting a POC on Gen AI.
☆ Developed an advanced infrastructure-monitoring application using AI and machine learning techniques, addressing critical challenges in system reliability, error detection, and debugging.
☆ Integrated Natural Language Processing (NLP) techniques to extract log patterns, compare them against known error signatures, and interpret new logs in the context of historical failure models.
☆ Implemented a log anomaly detection model leveraging OpenAI's GPT-3.5 Turbo, fine-tuned on a custom dataset of 2,000+ real-world Spark application logs.
☆ Designed an intuitive web interface using Plotly Dash and Dash Components, enabling users to input raw unstructured logs and receive real-time anomaly detection, root cause analysis, and suggested fixes.

Responsibilities:

  • Converted complex Informatica workflows to Apache Spark applications, enabling distributed computing and enhancing data processing capabilities for large-scale financial and supply chain data workloads.
  • Orchestrated data pipelines to load data from QuickBase to AWS S3, processed data using Apache Spark applications on AWS EMR, and loaded the processed data back to QuickBase, streamlining data integration between different systems.
  • Leveraged Hive databases to load and manage finance data on AWS S3, maintaining data integrity, accessibility, and compliance with financial reporting standards.
  • Conducted comprehensive like-to-like testing and validation, comparing traditional and cloud-based systems to ensure accurate data migration, seamless integration, and adherence to data quality standards.
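
The log-pattern extraction described under Achievements can be sketched as masking variable tokens (numbers, IP addresses, hex IDs) so that structurally identical log lines collapse to one template that can be compared against known error signatures. The regexes and sample log lines are illustrative assumptions, not the production rules.

```python
import re

# Sketch of log-template extraction: mask variable tokens so that
# structurally identical lines collapse to one template. The masking
# regexes and sample Spark-style log lines are illustrative only.
MASKS = [
    (re.compile(r"0x[0-9a-fA-F]+"), "<HEX>"),
    (re.compile(r"\b\d+\.\d+\.\d+\.\d+\b"), "<IP>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]

def to_template(line: str) -> str:
    """Replace variable tokens with placeholders, most specific first."""
    for pattern, token in MASKS:
        line = pattern.sub(token, line)
    return line

logs = [
    "Lost executor 17 on 10.0.3.21: heartbeat timed out after 120000 ms",
    "Lost executor 4 on 10.0.3.9: heartbeat timed out after 120000 ms",
]
templates = {to_template(l) for l in logs}  # both lines collapse to one template
```

Templates derived this way can then be counted, compared against an error-signature catalogue, or fed as compact context to a language model.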

Societe Generale

Specialist Software Engineer

November 2021 - September 2022

Technologies:

Apache Spark, Scala, Hive, Hadoop (HDFS), Oozie, PowerBI

Skills:

Big data processing, fraud detection, regulatory compliance (ISO 20022), disaster recovery, report automation

Achievements:

☆ Developed fraud detection and financial data analysis applications in a private cloud environment, improving transaction security on the SWIFT payment system.
☆ Upgraded banking data to the ISO 20022 standard in compliance with new regulations using a Scala-based Spark application, reducing processing time by 40% and improving accuracy.
☆ Adopted Hive databases with advanced compressed file formats for distributed data, increasing storage efficiency by 87%.
☆ Created a suite of 55 reports within two months to serve as a data source for PowerBI visualizations, enabling valuable insights from a vast pool of data in the data lake.
☆ Demonstrated resiliency and structural consistency by performing disaster recovery tests, achieving 92% of testing benchmarks on Hadoop production clusters.

Responsibilities:

  • Developed Spark applications in Scala using Datasets for the correspondent banking domain on private on-premises cloud infrastructure.
  • Stored transformed data in partitioned Hive external tables using compressed file formats such as Parquet and ORC, saving up to 87% of storage space.
  • Used Oozie to schedule Spark jobs at specific intervals, automating report generation and eliminating regular manual intervention.
  • Developed Spark applications to migrate existing use cases from legacy Java applications, handling large (Big Data) datasets at a scale that supports future growth.
  • Upgraded banking data to the ISO 20022 standard via Spark applications while migrating them to the Big Data environment, in line with new banking regulations.
  • Developed applications generating a family of reports that feed custom UIs and PowerBI dashboards, surfacing insights from the vast data lake.
  • Performed disaster recovery tests on a functional module containing multiple Spark jobs, covering data replication, structural integrity, test plan creation, and the SOP to follow during a disaster.

During my experience at Société Générale in correspondent banking, I built applications to detect fraudulent transactions and money-laundering activity in high-risk countries on the SWIFT payment system, and I generated reports flagging unusual transactions. The role required technical depth in the big data domain, along with skills in data cleaning, data analysis, and building and deploying applications in cloud environments. As a Specialist Software Engineer, I analysed the data arriving in the system, automated data quality checks before ingestion into the data lake, developed Spark applications over data landing in HDFS to design and build a functional data pipeline, and presented the results as tables, reports, and visualisations. In addition, I designed and executed disaster recovery procedures for Spark applications within a private cloud environment, ensuring the resilience and uninterrupted operation of critical data systems.
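
The unusual-transaction flagging described above can be sketched as a simple rules pass; the high-risk country codes and the amount threshold below are hypothetical placeholders, not the bank's actual rules.

```python
# Illustrative sketch of rule-based transaction flagging. The country
# codes and the reporting threshold are hypothetical placeholders.
HIGH_RISK = {"XX", "YY"}      # hypothetical high-risk country codes
LARGE_AMOUNT = 10_000.0       # hypothetical reporting threshold

def flag(txn: dict) -> list:
    """Return the list of reasons a transaction should be flagged."""
    reasons = []
    if txn["country"] in HIGH_RISK:
        reasons.append("high-risk-country")
    if txn["amount"] >= LARGE_AMOUNT:
        reasons.append("large-amount")
    return reasons

flags = flag({"id": "T9", "country": "XX", "amount": 12_500.0})
```

In the real pipeline such rules would run inside a Spark job over the full transaction stream, with flagged rows collected into the reports mentioned above.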


Mindtree

Senior Engineer

June 2020 - November 2021

Technologies:

Apache Spark (incl. Spark Streaming), Kafka, Scala, Python, SQL, Amazon S3, CouchDB, Couchbase, PostgreSQL, Confluence

Skills:

Real-time data streaming, cloud migration (Mainframe to AWS), end-to-end data integration, client communication

Achievements:

☆ Received the “A-Team” Award from Mindtree six times for collaborative team spirit.
☆ Led the module migration from Mainframe to cloud using Amazon S3, CouchDB, and Postgres, reducing data processing time by 20% with improved scalability.
☆ Created real-time data streaming applications using Apache Spark Streaming and Kafka that reduced data processing latency by 50%, enabling real-time decision-making for the business.
☆ Designed automated end-to-end data pipelines, leveraging 4 years of historical business data to validate data flow between components and eliminate manual intervention.

Responsibilities:

  • Delivered end-to-end data integration from source systems through to tables and the UI.
  • Analysed business rules and data integrity requirements for data transformations.
  • Transformed data using Scala, Python, and SQL scripts.
  • Worked with PostgreSQL and Couchbase.
  • Handled 4+ years of business data to ensure accuracy.
  • Developed modules comprising multiple Spark jobs to integrate multiple data flows.
  • Interacted continuously with the client, providing updates and gathering feedback on modules, using Confluence for project collaboration.
  • Owned module-level job integration, reconciling the legacy system with the current cloud system to meet data accuracy requirements.
  • Collected software requirement specifications during the design phase.
  • Prepared test scenarios based on functionality.
  • Validated data flows between multiple Spark jobs to execute the end-to-end flow into tables and the UI.
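
The end-to-end data-flow validation in the bullets above can be sketched as checking that every record key emitted by one stage arrives at the next. The stage names and record keys are illustrative; in practice each stage's key set would come from a Spark job's output.

```python
# Sketch of validating data flow between pipeline stages: every key
# emitted by one stage should arrive at the next. Stage names and keys
# here are illustrative stand-ins for Spark job outputs.

def validate_flow(stages):
    """Given an ordered {stage_name: key_set} dict, return the keys
    dropped between each pair of consecutive stages."""
    names = list(stages)
    dropped = {}
    for prev, curr in zip(names, names[1:]):
        lost = stages[prev] - stages[curr]
        if lost:
            dropped[f"{prev}->{curr}"] = lost
    return dropped

dropped = validate_flow({
    "ingest":    {"r1", "r2", "r3"},
    "transform": {"r1", "r2", "r3"},
    "load":      {"r1", "r3"},
})
```

An empty result means no records were lost end to end; otherwise the report pinpoints the stage boundary where records disappeared.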

My experience at Mindtree as a Senior Engineer exposed me to the Travel and Hospitality industry, where I developed against semi-structured and unstructured data in varying file formats and became familiar with a range of SQL and NoSQL databases. I applied these skills to build Spark applications for inventory optimization, demand and supply, and hotel room booking forecasts. A notable achievement during this time was successfully transitioning the analytical system from a legacy Mainframe network to cloud-based AWS infrastructure within just three years, for which the team recognised me with six awards.


Mindtree

Engineer

January 2019 - June 2020

Technologies:

Apache Spark (Scala, Python), AWS EMR, Oozie, Hadoop (MapReduce, Pig, Hive), PowerBI

Skills:

Big data transformations, test automation, requirements analysis, Travel/Transport/Hospitality domain knowledge

Achievements:

☆ Ensured data accuracy compliance across legacy and cloud systems, leveraging PowerBI visualizations to achieve a targeted accuracy level exceeding 90%.
☆ Reduced testing time by up to 60% by developing automated test scenarios executed via Oozie orchestration flows on an Amazon EMR cluster.

Responsibilities:

  • Implemented Spark scripts in Scala and Python to handle Big Data and complex data transformations.
  • Executed Spark jobs in the AWS cloud environment with Oozie orchestration flows.
  • Attended daily scrum meetings, providing team leads with continuous updates on assigned tasks.
  • Developed SQL scripts in Spark for handling different data sets.
  • Interacted with the business analysis and technical support teams to capture requirements and optimally design the data flow.
  • Worked with various input file formats for Spark jobs, including text, JSON, CSV, and Parquet.
  • Gained hands-on experience with Hadoop and associated technologies such as MapReduce, Spark, Pig, and Hive.
  • Built domain knowledge in the Travel, Transport, and Hospitality industry.

Mindtree

Trainee Software Engineer

October 2018 - December 2018


Responsibilities:

  • Trained on a wide range of technologies with a specialization in Big Data.
  • Trained on Big Data technologies such as Hadoop (HDFS, YARN), Hive, HBase, Kafka, and Spark.
  • Trained on AWS cloud services such as EC2, S3, EMR, RDS, SNS, Route53, Snowball, IAM, and Glacier.