hugging-face-evaluation
Plugin: huggingface/skills
Official Hugging Face skills defining AI/ML tasks like dataset creation, model training, and evaluation. Interoperable with Claude Code, OpenAI Codex, Gemini CLI, and Cursor using the standardized Agent Skill format.
Overview
Hugging Face Evaluation is a plugin from Hugging Face's official skills repository that provides AI/ML task definitions for model evaluation. The skills follow the standardized Agent Skill format, so the same definitions work across coding agents such as OpenAI Codex, Claude Code, Gemini CLI, and Cursor, each with its own installation method.
Key Features
- Cross-Agent Interoperability - Compatible with Claude Code, OpenAI Codex, Gemini CLI, and Cursor, with additional integrations for Windsurf and Continue in development
- Standardized Skill Format - Follows the Agent Skill format with YAML frontmatter, SKILL.md files, and AGENTS.md fallback for agents that do not support skills natively
- Multiple Installation Methods - Install via Claude Code plugin marketplace, Codex AGENTS.md auto-detection, or Gemini CLI extensions with local or URL-based setup
- ML Task Coverage - Skills cover dataset creation, model training, evaluation, and other core Hugging Face Hub operations
- Gemini Extension Support - Includes `gemini-extension.json` for native integration with the Gemini CLI
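To make the "YAML frontmatter plus SKILL.md" layout concrete, here is a minimal sketch of how a tool might split a skill file into its metadata and its instruction body. It is an illustration only, not how any of the listed agents actually load skills. The `name` and `description` keys, the sample text, and the `parse_frontmatter` helper are assumptions made for this example, and the parser handles only flat `key: value` pairs rather than full YAML.

```python
def parse_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md-style document into (metadata, body).

    Handles only flat `key: value` lines between two `---` markers;
    a real loader would use a proper YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter present
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing marker found: everything after it is the body.
            body = "\n".join(lines[i + 1:]).strip()
            return meta, body
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return {}, text  # unterminated frontmatter: treat as plain body


# Hypothetical skill file; the field values are placeholders.
SAMPLE = """---
name: hugging-face-evaluation
description: Hypothetical one-line summary of what the skill does.
---
Instructions for the agent go here.
"""

meta, body = parse_frontmatter(SAMPLE)
print(meta["name"])
print(body)
```

Keeping metadata in frontmatter lets an agent read the short header to decide whether a skill is relevant before loading the full instructions.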
Who is this for?
This skill is designed for ML engineers and data scientists who need AI-assisted guidance on evaluating machine learning models using Hugging Face tools and infrastructure. It is particularly useful for teams that work across multiple AI coding agents and want consistent evaluation workflows regardless of which tool they use.
Part of
huggingface-skills
Installation
/plugin marketplace add huggingface/skills
/plugin install hugging-face-evaluation@huggingface-skills
More from this repository
Agent Skills for AI/ML tasks including dataset creation, model training, evaluation, and research paper publishing on Hugging Face Hub
A skill for the Hugging Face Hub CLI (hf), enabling downloading, uploading, and managing models, datasets, spaces, buckets, repos, papers, and jobs on the Hugging Face Hub. Replaces the deprecated huggingface-cli.
Official Hugging Face Gradio skill for building Python web UIs and ML demos — covers the `Interface` high-level wrapper, `Blocks` low-level layout with explicit event wiring, `ChatInterface` for chatbots, core component signatures (Textbox, etc.), streaming inputs/outputs, custom CSS/JS, and sharing your app.
Explores, queries, and extracts data from any Hugging Face dataset via the Dataset Viewer REST API and npx tooling with zero Python dependencies — split/config discovery, row pagination, text search, filtering, SQL via parquetlens, and CLI upload.
Train or fine-tune language models using TRL on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment.
Hugging Face agent skills for AI/ML tasks like dataset creation, model training, and evaluation, interoperable with major coding agent tools.
Part of the official Hugging Face Skills collection, this skill teaches AI coding agents how to work with Transformers.js for running machine learning models directly in the browser and Node.js environments.