Hi, I am
Soham Sonar
I'm a software engineer who loves turning bold ideas into powerful applications.
Specializing in Python, Machine Learning, and Large Language Models, I build scalable, intelligent systems that solve real-world problems. Currently, my passion is fueled by the exciting world of Agentic AI, building with LangChain, LangGraph, and MCP servers to push the boundaries of what AI agents can do.
Always curious. Always building.
I enjoy learning new skills and implementing them!
About Me
Get to know me better
My interests lie primarily in AI/Machine Learning and Python software development, with a heart that beats for innovation and a mind wired for code.
Currently, I am working as a Research Engineer at the Gnosis Research Center at Illinois Tech. I have experience working at multiple startups (Vosyn AI and Wolfizer Technologies) and at an information technology company, Hexaware Technologies.
Recently, I have developed a strong interest in building AI applications using LLMs. You can check out some of the applications I made below.
My hobbies include playing chess and competing in tournaments, trekking, and weightlifting.
Education
Master of Computer Science
August 2023 - May 2025
Illinois Institute of Technology, Chicago, IL
GPA: 3.50/4.00
Bachelor of Computer Engineering (Honors in Data Science)
August 2018 - July 2022
Savitribai Phule Pune University
GPA: 3.55/4.00
Work Experience
My professional journey
Research Assistant
February 2025 - Present
Developed an agentic AI platform leveraging multi-agent orchestration, LLM fine-tuning, and conversational AI to automate end-to-end workflows across 40+ node clusters, enabling autonomous task execution and intelligent workflow coordination.
Conducted research on testing LLM-based application development (Cursor, Claude), building risk assessment frameworks and evaluating best practices to ensure robustness, reliability, and compliance in enterprise-scale AI systems.
Developed a scalable AI/ML pipeline leveraging Hadoop HDFS for distributed data ingestion and Spark MLlib for model training, processing over 100 TB of data, slashing feature engineering time by 50%.
Enhanced the performance of open-source projects (IOWarp, ChronoLog) by integrating an intuitive natural-language assistant for data analytics and AI-driven workflows, reducing average data retrieval latency by 40%.
Accelerated containerized deployment of HPC applications, cutting setup time by 15%, by leveraging Docker, Jarvis-CD, and Linux kernel tuning for cluster computers and scalable cloud-based environments (Chameleon Cloud).
Featured Projects
Showcasing my most impactful work

Enterprise IO Automation Framework (IOWarp MCPs)
2024 - Present
Led the development of the Model Context Protocol (MCP) server framework, including Pandas, Parquet, Plot and HDF5 MCP servers, to automate I/O and filesystem workflows for local and cloud environments.
Key Achievements:
- Designed a custom LLM client using the Google Gen AI SDK
- Coordinates 120+ simulation pipelines

Intelligent Security Operations Center (SOC)
July - August 2025
Built a hybrid log classification system and transformed it into an enterprise-grade SOC platform using ensemble ML (BERT + Groq/Llama 3.1) with real-time threat detection and event correlation.
Key Achievements:
- MCP-based agentic AI framework orchestration
- Slack, JIRA, and Grafana integrations

ChronoAI
April - June 2025
Engineered a Python inference pipeline to log LLM prompts and responses into ChronoLog, and built a Model Context Protocol (MCP) server for context-based retrieval and cross-platform communication.
Key Achievements:
- 50K+ real-time LLM conversations captured
- End-to-end LLM logging and retrieval
GitHub Repo Insights and Forecasting Tool
January - March 2025
Created and hosted a forecasting tool that analyzes over 10,000 GitHub issues and metrics, integrating the GitHub API to design data pipelines and deliver actionable insights.
Key Achievements:
- 10,000+ GitHub issues analyzed
- Machine learning forecasting models
P2P Publisher Subscriber System
October - December 2024
Built a fault-tolerant P2P publisher-subscriber system with 99.9% availability using topic replication in a hypercube topology.
Key Achievements:
- 99.9% system availability
- 30% performance optimization

Disease Prediction and Medical Recommendation System
March - May 2024
Implemented a machine learning system that uses a Random Forest model for accurate disease prediction, and built a Flask web application that provides tailored medication, diet, and workout recommendations.
Key Achievements:
- Random Forest ML model for accurate predictions
- Flask web application with intuitive UI
GitHub Activity
My coding journey visualized
Contribution Graph
Daily contributions over the past year
Recent Achievements
Latest research accomplishments and recognitions
ChronoLog & AI: A Scalable, Collaborative Solution for LLM Conversation Logging & Retrieval
Peer Reviewed Poster Presentation
Problem Statement
Large Language Models (LLMs) are in widespread use, with millions of LLM interactions occurring every second on cloud platforms, fueling demand for local and HPC-cluster-based models. Cloud providers capture vast logs but offer no AI/HPC-centric logging, nor the cross-chat or cross-user references needed for collaborative workflows.
Solution
ChronoLog delivers scalable, high-throughput distributed storage with precise physical timestamps, multi-user access, and fast time-range reads for on-premise and cluster environments, making it well suited to storing fast, structured AI conversation data.
Impact
Presented to a large group of scientists, professors, and students at SSDBM, showcasing innovative solutions for AI conversation logging in distributed systems.
Event Details
Venue: 37th International Conference on Scalable Scientific Data Management (SSDBM 2025), Columbus, Ohio
Type: Research Conference
Date: 2025
Interested in Research Collaboration?
I'm always open to discussing research opportunities in AI, distributed systems, and scalable data management. Feel free to reach out to explore potential collaborations.
Get in Touch
Skills & Expertise
Technologies I work with
Programming Languages
AI & Machine Learning
Web & Backend
Database
Cloud & Big Data
Tools & Version Control
Courses & Certifications
Academic achievements and professional development
Graduate Coursework
Professional Certifications
Get In Touch
I'm currently looking for full-time opportunities as a Software Engineer, AI Engineer, or Machine Learning Engineer for 2025. I'm open to all locations in the United States as well as remote roles, and my inbox is always open.
I have a genuine interest in connecting with new people and exchanging innovative thoughts and concepts. Please feel free to reach out to me!
Designed & Built by Soham Sonar