
HUST · THUNLP · Beijing Zhongguancun Academy

Cheng Yin

Embodied AI / VLA / Robot Learning

I work where reasoning systems meet physical action: VLA models, robot learning, and the research machinery that makes them improve.

Research hotline: Better Call Cheng Yin (Cheng Goodman & Associates), for robots stuck between seeing, reasoning, and acting.
Current role: Ph.D. student trained jointly across HUST, Tsinghua THUNLP, and Beijing Zhongguancun Academy.
Research focus: Embodied AI, VLA, robot learning, reinforcement learning.
Open-source footprint: 50 personal repos, 900+ total GitHub stars including DeepThinkVLA, 29 Hugging Face models, 9 datasets.

Research

A practical route from visual understanding to real-world action.

01

Vision-Language-Action

Interfaces between multimodal backbones and action policies, including action heads, decoding strategies, and cross-embodiment evaluation.

02

Reasoning for Robots

Making chain-of-thought useful for action by aligning reasoning with success signals, not just generating explanations.

03

Research Automation

Benchmark-driven tooling for faster training, evaluation, comparison, and reproducible iteration.

Work Index

A growing archive of projects, papers, and tools.

DeepThinkVLA is one visible node in a broader research line around embodied intelligence, robot reasoning, and automated research systems.

Selected Work

Research systems, resources, and agent tooling.

Open Source Map

Awesome-Embodied-AI

A maintained map of embodied AI resources: surveys, VLA models, datasets, simulators, humanoids, robot learning, and safety.

Open project
AutoResearch

autoresearch-qwen

An agent that autonomously researches Qwen3-VL training code on the official DocVQA benchmark, with NVIDIA multi-GPU and Apple Silicon paths.

Open project
Vision Product

Vision Intelligence

A project site exploring visual perception, spatial understanding, framing decisions, and real-world device execution.

Visit site
Agent Safety

dualkey-agent

Approvals, deterministic policy checks, and signed receipts for AI agent actions; a rough sketch of the flow appears below.

Open project
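To make that concrete, here is a minimal, hypothetical sketch of a dual-key flow in Python. The function names, the allowlist, and the signing key are illustrative assumptions, not the dualkey-agent API.

```python
import hashlib
import hmac
import json
import time

# Illustrative secret; a real deployment would load this from a secure store.
SIGNING_KEY = b"replace-with-a-real-secret"

# Deterministic policy: a fixed allowlist, so the same request always gets the same verdict.
ALLOWED_ACTIONS = {"read_file", "run_tests"}

def check_policy(action: str) -> bool:
    return action in ALLOWED_ACTIONS

def request_approval(action: str, args: dict) -> bool:
    # Second key: a human (or a separate service) must explicitly approve.
    answer = input(f"Approve {action} with {args}? [y/N] ")
    return answer.strip().lower() == "y"

def sign_receipt(action: str, args: dict, approved: bool) -> dict:
    # Signed receipt: a tamper-evident record of what was requested and decided.
    record = {"action": action, "args": args, "approved": approved, "ts": time.time()}
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return record

def execute(action: str, args: dict) -> dict:
    if not check_policy(action) or not request_approval(action, args):
        return sign_receipt(action, args, approved=False)
    # ... perform the approved action here ...
    return sign_receipt(action, args, approved=True)
```

The actual repository may structure these steps differently; the point is that every agent action passes a deterministic check and an explicit approval, and leaves a verifiable record.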

Open Source

Open-source systems, built in public and maintained with their communities.

Publications

All papers currently listed on Google Scholar.

DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models

Cheng Yin, Yankai Lin, Wang Xu, Sikyuen Tam, Xiangrui Zeng, Zhiyuan Liu, Zhouping Yin · arXiv 2025 · 12 citations on Google Scholar

arXiv

An improved data-driven predictive optimal control approach for designing hybrid electric vehicle energy management strategies

Cheng Yin, Xiangrui Zeng, Zhouping Yin · Applied Energy 375, 123984 · 2024 · 5 citations on Google Scholar

DOI

Does Optimal Control Always Benefit from Better Prediction? An Analysis Framework for Predictive Optimal Control

Xiangrui Zeng, Cheng Yin, Zhouping Yin · arXiv 2024 · 1 citation on Google Scholar

arXiv

StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing

StarVLA Community · arXiv 2026

arXiv

A Method to Improve the Performance of Reinforcement Learning Based on the Y Operator for a Class of Stochastic Differential Equation-Based Child-Mother Systems

Cheng Yin, Yifan Chen · arXiv 2023

arXiv

Zhihu Writing

Public notes from research and open-source building.

On Zhihu, I write about embodied AI, VLM agents, automated research, and open-source resources. The profile currently has 44 articles, 25 answers, 9 columns, and 858 followers.

Open Zhihu profile
44 Articles · 25 Answers · 1.3K Upvotes · 3.1K Favorites
Pinned note

An agent that can tune VLMs by itself

A public note about AutoResearch-Qwen: an agent loop that reads repositories, edits training code, runs experiments, evaluates results, and keeps score-improving changes. A rough sketch of the loop follows below.

Agent · VLM · Qwen · Self-improving AI
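To make the loop concrete, here is a minimal, hypothetical sketch in Python. It is not the AutoResearch-Qwen code: evaluate, propose_edit, and the eval.py entry point are placeholder assumptions standing in for the real benchmark run and the LLM-driven edit step.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

def evaluate(repo: Path) -> float:
    # Placeholder: run the benchmark script (e.g. a DocVQA eval) and parse a score
    # from the last line of its output.
    result = subprocess.run(["python", "eval.py"], cwd=repo, capture_output=True, text=True)
    return float(result.stdout.strip().splitlines()[-1])

def propose_edit(repo: Path) -> None:
    # Placeholder: have the agent (an LLM) read the training code and apply a patch.
    pass

def improve(repo: Path, iterations: int = 5) -> float:
    best = evaluate(repo)  # baseline score of the current training code
    for _ in range(iterations):
        workdir = Path(tempfile.mkdtemp())
        trial = workdir / "trial"
        shutil.copytree(repo, trial)   # experiment on a copy, never on the original
        propose_edit(trial)            # agent edits the copy
        score = evaluate(trial)        # run the experiment and measure it
        if score > best:               # keep only score-improving changes
            best = score
            shutil.rmtree(repo)
            shutil.copytree(trial, repo)
        shutil.rmtree(workdir)
    return best
```

The real system wraps this kind of loop with repository reading, multi-GPU training runs, and benchmark-specific evaluation, but the keep-if-better structure is the core idea.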
Pinned note

A living knowledge base for embodied AI

A note introducing Awesome-Embodied-AI as a collaborative research map for embodied intelligence, large models, humanoid robots, and multimodal systems.

Embodied AI · Large Models · Humanoid Robots

Life

Curiosity does not stop at the lab door.

Travel, architecture, light, water, and long walks keep me sensitive to the physical world I want intelligent systems to understand.

Walking through cities.
Chasing open horizons.
Finding calm in geometry.
Blue water, bright days.

A short travel clip from the same photo collection.

Contact

Open to conversations around embodied intelligence, VLA, robot learning, edge vision models, and research automation.