Interested in becoming a Ramblr? Join us at the intersection of AI and spatial computing.
Research, Technology, and Projects
Internal Research and Technology
HD Mask Rendering
Rendering pixel-accurate semantic segmentation masks over arbitrary-length video in real-time web applications presents significant technical challenges in synchronization, performance, and memory management.
Coffee-Making 101 (for Robots)
Detailed, versatile annotations, structured activity and process understanding, integration with VLMs, and instruction generation open up new opportunities for real-world robotic workflows and training.
Detector Training
Training our detection architectures on the full Coffee-Making 101 dataset, and on a more specific subset, demonstrates that high-quality annotations yield high-accuracy object detectors.
Activity Prediction with XGBoost
Ramblr's XGBoost-based activity predictor combines a powerful video encoder with a lightweight boosted-trees classifier, enabling fast, high-performance activity prediction for domain-specific applications.
GLobal OBject Embedder (GLOBE)
Ramblr's GLobal OBject Embedder (GLOBE) leverages binary masks to isolate and encode object regions, enabling robust global identification across time and transformations.
Research Projects
Show2Instruct
Envision AI systems that respond to spoken, context-specific queries – such as “Do all the windows in this room meet the requirements of the BIM specifications and accessibility standards?” – with real-time, visually grounded answers from an AI-powered device.
MuvAko
A forward-looking research initiative based in Saxony, Germany, focused on using generative AI and spatial computing to develop multimodal, context-aware assistance systems for virtual and mixed reality.
REACT
A fine-tuned MLLM designed to understand spatial and temporal context by analyzing video and sensor data. Our goal is to enable machines to follow and understand complex human instructions in dynamic environments.