Machine Learning Code GitHub

Post-Completion Learning for Language Models

Our code is based on open-r1, with our customized Trainer for mixed SFT+GRPO training. Some other updates focus on the white-box RL (reward function design) and post-completion training (replacement ...

IEEE

Aligning Crowd-Sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models

Abstract: This paper studies how AI-assisted programming and large language models (LLMs) improve software developers' ability via AI tools (LLM agents) like GitHub Copilot and Amazon CodeWhisperer, ...

MSN (3d ago)

I'm 18 and cofounded an AI startup with teens around the world who I've never met in-person. I had no network, so I built one online — here's how.

Alex Yang details founding an AI startup with other high schoolers worldwide to improve Alzheimer's diagnostics.

Visual Studio Magazine

Microsoft Quietly Kills IntelliCode as AI Strategy Shifts to Subscription Copilot

skytells-research/stock-risk-analyzer

Blending Humanistic Inquiry and Technology, Carnegie Mellon Leads a New Era of Cultural Study and Research

Hacking as a Prompt: Malicious LLMs Find Users

Variable-Length Feedback Codes via Deep Learning