Hi, I am
Soham Sonar

I'm a software engineer who loves turning bold ideas into powerful applications.

Specializing in Python, Machine Learning, and Large Language Models, I build scalable, intelligent systems that solve real-world problems. Currently, my passion is fueled by the exciting world of Agentic AI: I build with LangChain, LangGraph, and MCP servers to push the boundaries of what AI agents can do.

Always curious. Always building.

I enjoy learning new skills and implementing them!


About Me

Get to know me better

My interests lie primarily in AI/Machine Learning and Python software development, with a heart that beats for innovation and a mind wired for code.

Currently, I am working as a Research Engineer at the Gnosis Research Center at Illinois Tech. I have experience working at multiple startups (Vosyn AI, Wolfizer Technologies) and at an information technology company, Hexaware Technologies.

Recently, I have developed a strong interest in building AI applications with LLMs. You can check out some of the applications I have built below.

My hobbies include playing chess (and competing in tournaments), trekking, and weightlifting.

Education

Master of Computer Science

August 2023 - May 2025

Illinois Institute of Technology, Chicago, IL

GPA: 3.50/4.00

Bachelor of Computer Engineering (Honors in Data Science)

August 2018 - July 2022

Savitribai Phule Pune University

GPA: 3.55/4.00

Work Experience

My professional journey

Research Assistant

February 2025 - Present

Developed an agentic AI platform leveraging multi-agent orchestration, LLM fine-tuning, and conversational AI to automate end-to-end workflows across 40+ node clusters, enabling autonomous task execution and intelligent workflow coordination.

Conducted research on testing LLM-based application development (Cursor, Claude), building risk assessment frameworks and evaluating best practices to ensure robustness, reliability, and compliance in enterprise-scale AI systems.

Developed a scalable AI/ML pipeline using Hadoop HDFS for distributed data ingestion and Spark MLlib for model training, processing over 100 TB of data and cutting feature engineering time by 50%.

Enhanced the performance of open-source projects (IOWarp, ChronoLog) by integrating an intuitive natural language assistant for data analytics and AI-driven workflows, reducing average data retrieval latency by 40%.

Accelerated containerized deployment of HPC applications, cutting setup time by 15%, by leveraging Docker, Jarvis-CD, and Linux kernel tuning for cluster computers and scalable cloud-based environments (Chameleon Cloud).

Featured Projects

Showcasing my most impactful work

Enterprise IO Automation Framework (IOWarp MCPs)
Currently Active


2024 - Present

Led the development of a Model Context Protocol (MCP) server framework, including Pandas, Parquet, Plot, and HDF5 MCP servers, to automate I/O and filesystem workflows for local and cloud environments.

Key Achievements:

  • Designed a custom LLM client using the Google Gen AI SDK
  • Coordinates 120+ simulation pipelines
Tags: Python, MCP, Google Gen AI SDK, Pandas, +2 more

Intelligent Security Operations Center (SOC)

July - August 2025

Built a hybrid log classification system and transformed it into an enterprise-grade SOC platform using ensemble ML (BERT + Groq/Llama 3.1) with real-time threat detection and event correlation.

Key Achievements:

  • MCP-based agentic AI framework orchestration
  • Slack, JIRA, and Grafana integrations
Tags: Python, BERT, Groq, Llama 3.1, +4 more

ChronoAI

April - June 2025

Engineered a Python inference pipeline to log LLM prompts and responses into ChronoLog, and built a Model Context Protocol (MCP) server for context-based retrieval and cross-platform communication.

Key Achievements:

  • 50K+ real-time LLM conversations captured
  • End-to-end LLM logging and retrieval
Tags: Python, ChronoLog, MCP, LLM APIs, +1 more

GitHub Repo Insights and Forecasting Tool

January - March 2025

Created and hosted a forecasting tool that analyzes over 10,000 GitHub issues and related metrics, integrating the GitHub API to build data pipelines and deliver actionable insights.

Key Achievements:

  • 10,000+ GitHub issues analyzed
  • Machine learning forecasting models
Tags: Python, React, Flask, Elasticsearch, +2 more

[Figure: P2P network of publishers (PUB) and subscribers (SUB) around a hub; labels: 99.9% uptime, 1M+ topics, hypercube topology]

P2P Publisher Subscriber System

October - December 2024

Built a fault-tolerant P2P publisher-subscriber system with 99.9% availability using topic replication in a hypercube topology.

Key Achievements:

  • 99.9% system availability
  • 30% performance optimization
Tags: Linux, P2P Networks, Hypercube Topology, Async I/O, +1 more

Disease Prediction and Medical Recommendation System

March - May 2024

Implemented a machine learning system that uses a Random Forest model for disease prediction, and built a Flask web application that provides tailored medication, diet, and workout recommendations.

Key Achievements:

  • Random Forest ML model for accurate predictions
  • Flask web application with intuitive UI
Tags: Machine Learning, Flask, Random Forest, Data Analysis, +1 more

GitHub Activity

My coding journey visualized

Contribution Graph

Daily contributions over the past year

  • Contributions this year: 500+
  • Public repositories: 25+
  • Longest commit streak: 180+

Recent Achievements

Latest research accomplishments and recognitions

Research Conference

ChronoLog & AI: A Scalable, Collaborative Solution for LLM Conversation Logging & Retrieval

Peer Reviewed Poster Presentation

Authors: Soham Sonar, Jaime Cernuda, Dr. Anthony Kougkas, and Dr. Xian-He Sun

Problem Statement

Large Language Models (LLMs) are in widespread use, with millions of LLM interactions occurring every second on cloud platforms, fueling demand for local and HPC-cluster-based models. Cloud providers capture vast logs but offer neither AI/HPC-centric logging nor the cross-chat and cross-user references needed for collaborative workflows.

Solution

ChronoLog delivers scalable, high-throughput distributed storage with precise physical timestamps, multi-user access, and fast time-range reads for on-premises and cluster environments, making it well suited to storing fast, structured AI conversation data.

Impact

Presented to a large group of scientists, professors, and students at SSDBM, showcasing innovative solutions for AI conversation logging in distributed systems.

Tags: AI/ML, Distributed Systems, HPC, Research, MCP
Event Details

Venue: 37th International Conference on Scalable Scientific Data Management (SSDBM 2025), Columbus, Ohio

Type: Research Conference

Date: 2025

Interested in Research Collaboration?

I'm always open to discussing research opportunities in AI, distributed systems, and scalable data management. Feel free to reach out to explore potential collaborations.


Skills & Expertise

Technologies I work with

Programming Languages

Python
Java
C++
R
HTML
CSS

AI & Machine Learning

GenAI/LLMs
Machine Learning
Agentic AI
AutoGen
MCP Servers
RAG

Web & Backend

Django
Node.js
React
REST APIs
SaaS
Next.js

Database

MySQL
PostgreSQL
Firebase
MongoDB
Redis
Elasticsearch

Cloud & Big Data

AWS
GCP
Azure
Docker
Kubernetes
Hadoop
Spark

Tools & Version Control

GitHub
Git
Jira
VS Code
Jupyter
Postman

Courses & Certifications

Academic achievements and professional development

Graduate Coursework

Design and Analysis of Algorithms (CS535)
Advanced Operating Systems (CS550)
Cloud Computing (CS553)
Advanced Database Organization (CS525)
Machine Learning (CS584)
Natural Language Processing (CS585)
Data Preparation and Analysis (CSP571)
Big Data Technologies (CSP574)

Professional Certifications

100 Days of Python Bootcamp
Java Data Structures and Algorithms
Google Cloud Architecture
Python by University of Michigan

Get In Touch

I'm currently looking for full-time opportunities as a Software Engineer, AI Engineer, or Machine Learning Engineer for 2025. I'm open to all locations in the United States as well as remote roles, and my inbox is always open.

I have a genuine interest in connecting with new people and exchanging innovative ideas. Please feel free to reach out to me!

United States (Open to all locations)
soham.sonar427@gmail.com

Designed & Built by Soham Sonar