Beyond Vision

Project Results & Achievements

Object Detection

91.3%

mAP@0.5 for campus navigation

ATM Interface

89.4%

mAP@0.5 in button detection

OCR Accuracy

94.8%

Text recognition accuracy

VQA Score

4.3/5

Human relevance rating

System Training & Validation

YOLOv8 Training Metrics

  • Trained for 50 epochs
  • Early stopping to avoid overfitting once validation stopped improving
  • Evaluated with mAP@0.5 and mAP@0.5:0.95
  • Custom dataset of 4,202 images
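
The two metrics listed above differ only in how strictly a predicted box must overlap a ground-truth box. The following is a minimal, illustrative sketch in plain Python, not the project's evaluation code (the reported numbers presumably come from the training framework's own validator). The function names `iou`, `average_precision`, and `ap_50_95` are hypothetical, and the sketch assumes a single class and a single image for brevity.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds, gts, iou_thr):
    """AP for one class: preds is a list of (confidence, box), gts a list of boxes."""
    if not gts:
        return 0.0
    matched = set()   # indices of ground-truth boxes already claimed
    tp_flags = []     # one flag per prediction, in descending confidence order
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j is not None and best_iou >= iou_thr:
            matched.add(best_j)
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    # Precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / len(gts))

    # Precision envelope: make precision non-increasing as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def ap_50_95(preds, gts):
    """Mean AP over IoU thresholds 0.50, 0.55, ..., 0.95."""
    thresholds = [(50 + 5 * i) / 100 for i in range(10)]
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)
```

With more than one class, mAP is the mean of the per-class AP values. A detector whose boxes are roughly right but loosely fitted can score well at the 0.5 threshold and noticeably lower on the 0.5:0.95 average, which is why both are usually reported.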

VLM Performance

  • CogVLM outperformed BLIP-2 in contextual understanding
  • Inference-only mode deployment
  • High relevance in free-form queries
  • Real-time response capabilities

Comparison with State-of-the-Art

Our System

  • 91.3% mAP@0.5 obstacle detection
  • 89.4% mAP@0.5 ATM button detection
  • VQA integration
  • Real-time processing

Zhang et al.

  • 85% obstacle accuracy
  • No VQA capabilities
  • Limited context awareness
  • Smartphone-only

Johnson et al.

  • 90% button accuracy
  • Limited robustness
  • No contextual understanding
  • Fixed lighting conditions

Key Takeaways & Future Work

Achievements

  • High detection accuracy: 91.3% mAP@0.5 (campus navigation) and 89.4% mAP@0.5 (ATM buttons)
  • Successful VLM integration
  • Reliable OCR performance
  • Modular system architecture

Future Directions

  • Dynamic lighting adaptation
  • Enhanced offline capabilities
  • Indoor location mapping
  • User customization options