Coding News and Research

DeepSeek’s NSA Outperforms Full Attention, Making AI Models Faster and Smarter

Researchers introduce Native Sparse Attention (NSA), a novel sparse attention mechanism that accelerates long-context language models while maintaining high accuracy through trainable sparsity and hardware-optimized execution.

19 Feb 2025

MIT Study Shows Large Language Models Think in English Before Reasoning in Other Languages

MIT researchers discovered that large language models (LLMs) process diverse data through a central "semantic hub," similar to the human brain, improving our understanding of AI reasoning.

19 Feb 2025

Why People Trust AI for Music Picks but Not for Medical Decisions

A new study finds that people trust AI for low-stakes decisions like music recommendations but are skeptical in high-stakes areas like healthcare, and that this skepticism is most pronounced among people with strong statistical literacy.

17 Feb 2025

Students Build Beginner-Friendly Machine Learning Guide to Combat Antibiotic Resistance

San Francisco State University researchers created a step-by-step machine-learning tutorial to predict antibiotic resistance, making complex concepts accessible to beginners in biology and health sciences.

11 Feb 2025

AI Safety Showdown: DeepSeek-R1 Falls Behind OpenAI’s o3-mini

DeepSeek-R1 produces unsafe responses nearly ten times as often as OpenAI’s o3-mini, particularly in the financial crime, violence, and hate speech categories. OpenAI’s policy safeguards play a crucial role in preventing harmful outputs.

2 Feb 2025

AI Model EpiBERT Predicts Gene Expression by Unlocking DNA’s Hidden Regulatory Code

Scientists have developed EpiBERT, an AI model that predicts gene expression by decoding chromatin accessibility and regulatory grammar across human cell types.

29 Jan 2025

Embodied AI Model Learns Like a Toddler, Revealing Secrets of Cognitive Development

Researchers at OIST have developed an embodied intelligence model that mimics how children learn to generalize by combining sensory inputs and language, offering insights into cognitive development and AI transparency.

22 Jan 2025

DeepSeek-V3 Sets New Standards in Open-Source AI Development

DeepSeek researchers unveil DeepSeek-V3, a 671B-parameter open-source language model with state-of-the-art performance, achieved through innovative architectures and cost-effective training. The model rivals closed-source giants like GPT-4o while setting efficiency benchmarks in AI development.

8 Jan 2025

Hour of Code Activities Fall Short in Comprehensive AI Education

A study reveals that while Hour of Code activities excel at introducing AI basics, they often lack the depth, hands-on creativity, and critical engagement needed for a well-rounded understanding of artificial intelligence.

5 Jan 2025

Phi-4: Microsoft Research's 14B Parameter Model Advances STEM Reasoning

Phi-4, a 14-billion-parameter model by Microsoft Research, leverages synthetic data and innovative training to excel in STEM reasoning and coding tasks, outperforming larger models like GPT-4o.

5 Jan 2025

LG's EXAONE 3.5 Sets New Standards in Generative AI

Researchers at LG AI Research unveiled EXAONE 3.5, a series of instruction-tuned large language models that excel in long-context comprehension, bilingual capabilities, and competitive benchmarks.

18 Dec 2024

How AI Is (and Isn’t) Changing Slow Journalism

Researchers at Complutense University of Madrid explored AI's role in slow journalism, revealing limited adoption due to skepticism over creativity, ethics, and quality.

10 Dec 2024

TÜLU 3 Pushes the Boundaries of AI Post-Training Excellence

Researchers at Allen AI introduced TÜLU 3, an open-source framework for refining language models with advanced post-training techniques like RLVR, achieving superior performance over proprietary models in specific tasks and benchmarks. The release includes datasets, recipes, and evaluation tools to advance open AI research.

2 Dec 2024

Logic Training Transforms AI Into Smarter Problem-Solver

Researchers propose Additional Logic Training (ALT) to enhance reasoning in large language models using a robust, synthetic corpus, leading to significant performance boosts across logic, math, coding, and natural language tasks.

27 Nov 2024

AI Falters in Language Comprehension as Humans Maintain the Lead

Researchers tested seven advanced language models on a new comprehension benchmark and found they performed at chance accuracy, with inconsistent and non-human-like errors, while humans consistently outperformed them.

19 Nov 2024

Qwen2.5-Coder Redefines Coding AI With Scalable, High-Performance Models

Researchers unveiled Qwen2.5-Coder, a cutting-edge series of code-generation models outperforming larger competitors on key benchmarks, redefining coding intelligence. The series showcases exceptional scalability, long-context handling, and multilingual capabilities.

17 Nov 2024

Adaptive AI Agents Tackle Complex Tasks with Microsoft’s Magentic-One System

Microsoft's Magentic-One introduces a multi-agent AI system, coordinated by an Orchestrator, that autonomously handles complex tasks. Rigorous evaluation on diverse benchmarks showcases its adaptable and secure task-solving approach.

13 Nov 2024

Tencent’s Hunyuan-Large AI Model Sets New Benchmark with 389 Billion Parameters

Hunyuan-Large, Tencent’s largest open-source Transformer-based mixture-of-experts (MoE) model, pushes the boundaries of AI with 389 billion total parameters and 52 billion activated parameters, excelling in tasks like reasoning, coding, and long-context processing. It outperforms leading models like Llama 3.1, demonstrating superior scalability and efficiency.

11 Nov 2024

Apple Researchers Challenge Large Language Models' Math Reasoning Capabilities with New Benchmark

Apple researchers introduced GSM-Symbolic, a new benchmark to reveal the weaknesses in large language models' mathematical reasoning, showing that they rely heavily on pattern-matching rather than genuine logic.

21 Oct 2024

OpenAI Advances AI Performance By Benchmarking Agents On Kaggle Competitions

OpenAI's MLE-bench evaluates AI agents on machine learning engineering tasks drawn from Kaggle competitions, with the best-performing agent reaching medal-level results in nearly 17% of competitions. The benchmark is open-sourced to boost research on autonomous ML engineering.

15 Oct 2024