Hello! 👋

I'm Ido Pinto

AI/ML Research Engineer

I'll be at ICML 2026 in Seoul 🇰🇷. If you're there too, let's connect! Btw, I'm open to work :)

About Me

M.Sc. in CS from HUJI, advised by Prof. Guy Katz. Interested in LLMs and their applications to formal methods and program verification.

Through my research I've gotten hands-on with the full ML lifecycle. I built a scalable data pipeline for loop invariant generation (accepted @ ICML 2026!). I care a lot about evals, observability, and reproducibility.

These days I'm into RL, LLM inference optimization, and building reliable, useful AI agents. I like bridging research and production, turning ideas that work in a notebook into systems that actually ship. If any of this interests you, reach out!

Education

Master of Science in Computer Science

The Hebrew University of Jerusalem

2024 - 2026

Advised by Prof. Guy Katz; GPA 97.7 (Cum Laude)
First author of "Not All Invariants Are Equal" accepted for poster presentation @ ICML (Seoul, July 2026)
Delivered oral presentation of thesis research @ the RobustifAI Consortium (Siemens, Belgium, Feb 2026)

Relevant coursework

Advanced Machine Learning
Bayesian Machine Learning
Advanced Natural Language Processing
Information Theory
Deep Learning & NLP for Accelerating Science

Bachelor of Science in Computer Science

The Hebrew University of Jerusalem

2020 - 2024

GPA 87.2

Relevant coursework

Probability & Statistics
Data Structures & Algorithms
Operating Systems
Communication Networks
Object-Oriented Programming
Machine Learning
Natural Language Processing
Image Processing

Publications

Not All Invariants Are Equal: Curating Training Data to Accelerate Program Verification with SLMs

ICML 2026 (Poster)

Ido Pinto, Yizhak Yisrael Elboher, Haoze Wu, Nina Narodytska, Guy Katz

Introduces WONDA, a data curation pipeline that refines noisy verifier-generated invariants via AST-based normalization and LLM-driven rewriting. Fine-tuning SLMs on this curated data doubles invariant correctness and verified speedup rates; a 4B model matches GPT-OSS-120B utility and approaches GPT-5.2.

Paper | Project Page | Code | Models | Data

Projects

Verifier-in-the-Loop LLM Agents for Code Repair

Designed an iterative repair agent that pairs a reasoning LLM with a formal verifier, evaluated on 497 unverified Dafny programs from DafnyBench. Found that counterexamples alone did not improve overall repair rates, but combining runs with and without them reached 62.4% (+8 points over either approach).

Report | Code | Data

BioAspire

Extended the ASPIRE document similarity framework on biomedical retrieval benchmarks. Fine-tuned ModernBERT and gte-Qwen2-1.5B with co-citation contrastive learning and BioNER-based augmentation via ScispaCy and UMLS entity linking.

Report | Code | Models

DM-ICCL

Novel in-context learning framework combining curriculum learning with semantic similarity retrieval. Categorized demonstrations by difficulty via a diagnostic pipeline, achieving 5.5% accuracy gains on MCQA benchmarks across Llama-3, Gemma-2, and Phi-3.5.

Report | Code

Experience

Teaching Assistant (Grader)

The Hebrew University of Jerusalem

2024 - 2026

Evaluated student assignments and exams and handled grade rebuttals across three semesters:
- Object-Oriented Programming (Fall 2024)
- Machine Learning (Spring 2025)
- Software Engineering & Communication Networks (Fall 2025)

Skills

AI / ML

ML & Data

NumPy, Pandas, SciPy, scikit-learn

NLP & Retrieval

spaCy, SciSpaCy, sentence-transformers, NLTK, FAISS

LLMs & Inference

Transformers, vLLM, OpenAI API, Hugging Face Hub

Training & Fine-Tuning

PyTorch, TRL, PEFT, Unsloth

Agents

OpenAI Agents SDK, DSPy

Experiment Tracking

W&B, Weave

Computer Vision

OpenCV, Image Processing

Languages

Python, C, C++, Java, SQL

Tools

Git, Linux, Bash, Docker, Slurm, Hydra, Jupyter, Z3