AI is employed in object detection to identify and locate objects within images or video. It utilizes deep learning techniques, such as convolutional neural networks (CNNs), to analyze visual data, detect objects of interest, and provide bounding box coordinates, enabling applications like autonomous driving, surveillance, and image recognition.
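To make the bounding-box mechanics concrete, the sketch below runs a pretrained CNN detector (torchvision's Faster R-CNN) over an image and prints labeled boxes. The image path and the 0.8 confidence threshold are illustrative assumptions, and this particular model simply stands in for whatever detector a given application would use.

```python
# Minimal sketch: a pretrained CNN detector (torchvision Faster R-CNN) returns
# bounding boxes for objects in an image. Path and threshold are illustrative.
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn,
    FasterRCNN_ResNet50_FPN_Weights,
)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()

img = read_image("street.jpg")        # uint8 tensor, C x H x W
batch = [weights.transforms()(img)]   # convert/normalize to the model's input format

with torch.no_grad():
    pred = model(batch)[0]            # dict with "boxes", "labels", "scores"

for label, box, score in zip(pred["labels"], pred["boxes"], pred["scores"]):
    if score > 0.8:  # keep confident detections only
        print(f'{weights.meta["categories"][int(label)]}: '
              f'{[round(v, 1) for v in box.tolist()]} (score {score:.2f})')
```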
Researchers developed a vision-language model pipeline that generates dense, grounded captions for comic panels, improving accessibility and understanding for visually impaired individuals. Their approach annotated over 2 million comic panels, advancing computational comic analysis.
Researchers found that the order in which UI elements are presented to language model (LM) agents is crucial, with dimensionality reduction improving task success rates by over 50% in pixel-only environments.
Basler AG, an international manufacturer of high-quality machine vision hardware and software, is expanding its proven pylon Software Suite with pylon AI, a set of image analysis functions powered by artificial intelligence algorithms that, unlike conventional algorithms, can solve more complex vision tasks such as classification and semantic segmentation.
Researchers explore how generative AI models like ChatGPT and DALL·E 2 capture the unique identities of global cities through text and imagery, revealing both strengths and limitations in AI's understanding of urban environments.
Researchers developed the "Deepdive" dataset and benchmarked deep learning models to automate the classification of deep-sea biota in the Great Barrier Reef, with the Inception-ResNet model achieving notably high accuracy.
This study compares four computer vision algorithms on a Raspberry Pi 4 platform for depalletizing applications. The analysis highlights pattern matching, SIFT, ORB, and Haar cascade methods, emphasizing low-cost, efficient object detection suitable for industrial and small-scale automation environments.
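To give a flavor of the methods compared, here is a minimal OpenCV sketch of the ORB approach, whose cheap binary descriptors make it a natural fit for low-cost hardware like the Pi. The image files and match thresholds are illustrative assumptions, not values from the study.

```python
# Sketch of ORB keypoint matching with OpenCV, one of the four compared
# methods. Image files and thresholds are illustrative assumptions.
import cv2

template = cv2.imread("box_template.png", cv2.IMREAD_GRAYSCALE)  # object to find
scene = cv2.imread("pallet_scene.png", cv2.IMREAD_GRAYSCALE)     # camera frame

orb = cv2.ORB_create(nfeatures=500)  # binary descriptors, cheap enough for a Pi 4
kp1, des1 = orb.detectAndCompute(template, None)
kp2, des2 = orb.detectAndCompute(scene, None)

# Hamming distance is the appropriate metric for ORB's binary descriptors
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

good = [m for m in matches if m.distance < 40]  # illustrative cutoff
print(f"{len(good)} strong matches; object likely present: {len(good) > 25}")
```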
The novel SBDet model introduces a relaxed rotation-equivariant network (R2Net) that improves object detection in scenarios with symmetry-breaking or non-rigid transformations. This innovation offers greater accuracy and robustness in real-world visual tasks like autonomous driving and geosciences.
Researchers developed a deep learning model using the YOLOv5 algorithm to detect potholes in real-time, assisting visually impaired individuals. The model, integrated into a mobile app, achieved 82.7% accuracy, offering auditory or haptic feedback to enhance user safety.
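For readers curious what such a detector looks like in code, below is a hedged sketch of YOLOv5 inference using the public Ultralytics torch.hub interface. The checkpoint name pothole_weights.pt is hypothetical, standing in for the authors' trained model, and the confidence threshold is illustrative.

```python
# Hedged sketch of pothole inference with YOLOv5 via torch.hub.
# "pothole_weights.pt" is a hypothetical checkpoint standing in for the
# authors' trained model; the threshold is illustrative.
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="pothole_weights.pt")
model.conf = 0.5  # confidence threshold

results = model("road_frame.jpg")       # accepts a file path, URL, or numpy frame
detections = results.pandas().xyxy[0]   # one row per detection

for _, det in detections.iterrows():
    # A mobile app could translate box position and size into auditory or haptic cues here
    print(det["name"], round(det["confidence"], 2),
          [det["xmin"], det["ymin"], det["xmax"], det["ymax"]])
```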
Researchers introduced an advanced YOLO model combined with edge detection and image segmentation techniques to improve the detection of overlapping shoeprints in noisy environments. The study demonstrated significant enhancements in detection sensitivity and precision, although edge detection introduced challenges, leading to mixed results.
Researchers introduced a framework to evaluate machine learning (ML) model robustness using item response theory (IRT) to estimate instance difficulty. By simulating real-world noise and analyzing performance deviations, they developed a taxonomy categorizing ML techniques based on their resilience to noise and instance challenges, revealing specific vulnerabilities and strengths of various model families.
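The core IRT idea can be shown with a toy example: fit a Rasch model to a models-by-instances correctness matrix and read off per-instance difficulty. The synthetic data and plain gradient ascent below are only illustrative, not the paper's implementation.

```python
# Toy Rasch (1-parameter IRT) fit: estimate per-instance difficulty from a
# models-by-instances correctness matrix. Synthetic data, illustrative only.
import numpy as np

rng = np.random.default_rng(0)
R = rng.integers(0, 2, size=(10, 50)).astype(float)  # R[i, j] = 1 if model i got instance j right

ability = np.zeros(R.shape[0])      # one parameter per model
difficulty = np.zeros(R.shape[1])   # one parameter per instance

for _ in range(500):
    # Rasch model: P(correct) = sigmoid(ability_i - difficulty_j)
    p = 1.0 / (1.0 + np.exp(-(ability[:, None] - difficulty[None, :])))
    grad = R - p                           # gradient of the Bernoulli log-likelihood
    ability += 0.01 * grad.sum(axis=1)     # ascend in ability
    difficulty -= 0.01 * grad.sum(axis=0)  # difficulty enters with a negative sign
    difficulty -= difficulty.mean()        # pin the scale's location for identifiability

print("Five most difficult instances:", np.argsort(difficulty)[-5:])
```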
This paper explores advanced drowning prevention technologies that integrate embedded systems, artificial intelligence (AI), and the Internet of Things (IoT) to enhance real-time monitoring and response in swimming pools. By utilizing computer vision and deep learning for accurate situation identification and IoT for real-time alerts, these systems significantly improve rescue efficiency and reduce drowning incidents.
An innovative AI-driven platform, HeinSight3.0, integrates computer vision to monitor and analyze liquid-liquid extraction (LLE) processes in real time. Using machine learning to track visual cues such as liquid levels and turbidity, the system significantly optimizes LLE, paving the way for autonomous lab operations.
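As a loose illustration of the kind of visual cue such a system tracks, the sketch below locates a liquid-liquid interface from row-wise brightness in a vial image and uses local contrast as a crude turbidity proxy. The real platform relies on trained ML models; this heuristic and the file name are assumptions.

```python
# Toy heuristic for one HeinSight-style visual cue: find the liquid-liquid
# interface from row-wise brightness. Illustrative only; "vial.png" is assumed.
import cv2
import numpy as np

frame = cv2.imread("vial.png", cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(frame, (5, 5), 0)

# Mean brightness per pixel row; a phase boundary appears as a sharp jump
row_profile = blur.mean(axis=1)
interface_row = int(np.argmax(np.abs(np.diff(row_profile))))

# Crude turbidity proxy: local contrast (std. dev.) within each phase
top_turbidity = blur[:interface_row].std()
bottom_turbidity = blur[interface_row:].std()
print(f"Interface at row {interface_row}; "
      f"turbidity proxies: top {top_turbidity:.1f}, bottom {bottom_turbidity:.1f}")
```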
Researchers introduced RMS-DETR, a multi-scale feature enhanced detection transformer, to identify weeds in rice fields using UAV imagery. This innovative approach, designed to detect small, occluded, and densely distributed weeds, outperforms existing methods, offering precision agriculture solutions for better weed management and optimized rice production.
Researchers introduced a new method for 3D object detection using monocular cameras, improving spatial perception and addressing depth estimation challenges. Their depth-enhanced deep learning approach significantly outperformed existing methods, proving valuable for autonomous driving and other applications requiring precise 3D localization and recognition from single images.
Researchers have introduced Decomposed-DIG, a set of metrics to evaluate geographic biases in text-to-image generative models by separately assessing objects and backgrounds in generated images. The study reveals significant regional disparities, particularly in Africa, and proposes a new prompting strategy to improve background diversity.
Researchers introduced the Virtual Experience Toolkit (VET) in the journal Sensors, utilizing deep learning and computer vision for automated 3D scene virtualization in VR environments. VET employs advanced techniques like BundleFusion for reconstruction, semantic segmentation with O-CNN, and CAD retrieval via ScanNotate to enhance realism and immersion.
Researchers developed ORACLE, an advanced computer vision model utilizing YOLO architecture for automated bird detection and tracking from drone footage. Achieving a 91.89% mean average precision, ORACLE significantly enhances wildlife conservation by accurately identifying and monitoring avian species in dynamic environments.
Researchers have introduced the human behavior detection dataset (HBDset) for computer vision applications in emergency evacuations, focusing on vulnerable groups like the elderly and disabled.
Researchers in a recent Smart Agricultural Technology study demonstrated how integrating machine learning (ML) and AI vision into all-terrain vehicles (ATVs) revolutionizes precision agriculture. These technologies automate tasks such as planting and harvesting, enhancing decision-making, crop yield, and operational efficiency while addressing data privacy and scalability challenges.
A comprehensive review highlights the evolution of object-tracking methods, sensors, and datasets in computer vision, guiding developers in selecting optimal tools for diverse applications.