- deepseek-ai/DeepSeek-R1
  Text Generation • 685B • Updated • 994k • 13.1k
- Qwen/Qwen2.5-Coder-32B-Instruct
  Text Generation • 33B • Updated • 868k • 2k
- google/gemma-2-27b-it
  Text Generation • 27B • Updated • 409k • 560
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
  Paper • 2201.11903 • Published • 15
Collections
Collections including paper arxiv:2307.15337

- R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
  Paper • 2505.21600 • Published • 71
- Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching
  Paper • 2412.17153 • Published • 39
- Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
  Paper • 2307.15337 • Published • 39
- DiTFastAttn: Attention Compression for Diffusion Transformer Models
  Paper • 2406.08552 • Published • 25

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 65
- Learning To Teach Large Language Models Logical Reasoning
  Paper • 2310.09158 • Published • 1
- ChipNeMo: Domain-Adapted LLMs for Chip Design
  Paper • 2311.00176 • Published • 9
- WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
  Paper • 2308.09583 • Published • 7

- Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
  Paper • 2309.08532 • Published • 54
- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 79
- Graph of Thoughts: Solving Elaborate Problems with Large Language Models
  Paper • 2308.09687 • Published • 7
- Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
  Paper • 2211.12588 • Published • 3

- Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
  Paper • 2308.10379 • Published
- Graph of Thoughts: Solving Elaborate Problems with Large Language Models
  Paper • 2308.09687 • Published • 7
- Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
  Paper • 2307.15337 • Published • 39
- Tab-CoT: Zero-shot Tabular Chain of Thought
  Paper • 2305.17812 • Published • 2

- Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
  Paper • 2310.13961 • Published • 5
- Diversity of Thought Improves Reasoning Abilities of Large Language Models
  Paper • 2310.07088 • Published • 5
- AutoMix: Automatically Mixing Language Models
  Paper • 2310.12963 • Published • 14
- SAI: Solving AI Tasks with Systematic Artificial Intelligence in Communication Network
  Paper • 2310.09049 • Published • 1

- Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes
  Paper • 2301.01751 • Published
- Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
  Paper • 2307.11768 • Published • 14
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 40
- Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
  Paper • 2307.15337 • Published • 39