Coding News and Research

AI Falters in Language Comprehension as Humans Maintain the Lead

Qwen2.5-Coder Redefines Coding AI With Scalable, High-Performance Models

Adaptive AI Agents Tackle Complex Tasks with Microsoft’s Magentic-One System

Tencent’s Hunyuan-Large AI Model Sets New Benchmark with 389 Billion Parameters

Apple Researchers Challenge Large Language Models' Math Reasoning Capabilities with New Benchmark

OpenAI Advances AI Performance By Benchmarking Agents On Kaggle Competitions

AI Transforms Game Development: DreamGarden Grows Playable Worlds from a Single Prompt

Meta GenAI Boosts AI Learning with CGPO, Tackling Reward Hacking and Improving Multi-Task Performance

NVIDIA's NVLM 1.0 Revolutionizes AI with Breakthrough Multimodal Performance

Large Language Models in Astronomy Can Boost Research but Pose Ethical Risks

ChatGPT Improves Software Security But Struggles With Complex Vulnerabilities

AI Camera Traps With Continual Learning Boost Real-Time Wildlife Monitoring Accuracy

Reviewing Drone Imagery for Infrastructure

Compressing CNNs Boosts Efficiency

CYBERSECEVAL 3 Security Benchmark Evaluates Risks in LLMs

Llama 3: Meta's New AI Model Rivals GPT-4

GenSQL: Enhancing Probabilistic Database Queries

Counterfactual Tasks Reveal Limits of Language Models

Assessing the Impact of LLMs on Aviation Tasks

LiveBench: A Dynamic Benchmark for Large Language Models
