Arte Museum

Agency: D’strict
Role: Technical Artist / Interaction + Tracking Dev
Location: Yeosu, South Korea (2021); Gangneung, South Korea (2022); Chengdu, China (2023); Dubai, UAE (2024)
Tech/Dev: Unity (HDRP) → Unreal Engine 5 + real-time object detection (YOLO) + ZeroMQ (ZMQ) networking

Arte Museum is Korea’s largest immersive media art exhibition, organized by d’strict, a world-class digital design company also well known for the public media art piece ‘WAVE’, presented at COEX.

As the dev team leader, I was in charge of the system network of 64 computers and ~100 projectors, and I oversaw the development of the media players and interactive installations for 11 attractions across 3 zones.

Teabar

Concept:
In this immersive installation, a table is projected with a serene nighttime lake scene, illuminated only by the soft glow of a low moon. When a cup of tea is placed on the table, a magical transformation takes place: the moon appears atop the cup, casting its light onto the surroundings and revealing the shimmering water and intricate branches. The moonlight emitted from the tea cup creates a breathtaking spectacle as flowers blossom all around, immersing viewers in a captivating and enchanting experience.

This post is mostly about the technical approach.

There are two significant takeaways in this project:

  1. Efficient flower animation rendering/control: rendered 5,000 independent flower animations, comprising some 10 million triangles, in only 645 draw batches, while also computing the distance between every flower and every cup each frame. This was made possible by Vertex Animation Textures (VAT) for the animation and Unity's Data-Oriented Technology Stack (DOTS), whose entity system lets the CPU chew through flat arrays of data in a GPU-like, batched fashion. These optimizations resulted in smooth playback at 60+ fps at 1080p and approximately 55 fps at 4K, and adding more cups had far less impact on performance than in a traditional object-oriented (OOP) implementation. (A minimal sketch of the per-frame distance test follows below.)

  2. Tea cup tracking with minimal, stable resource use: implemented real-time cup tracking using YOLOv4 with Darknet. The tracker prioritized steady memory management, keeping RAM and VRAM consumption flat to ensure stability and performance. CPU usage stayed around 10% of an i7-11700K, GPU usage around 10% of a 3080 Ti, with only 1.4GB of VRAM consumed, giving efficient and effective cup tracking with minimal resource requirements.


Together, these approaches let the project handle complex flower animations efficiently and use machine learning for precise cup tracking, all while maintaining high performance and low resource usage.
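
As a rough illustration of the data-oriented layout (the production version uses Unity DOTS with Burst-compiled jobs; this hypothetical C++ sketch only mirrors the idea), the per-frame work reduces to a tight loop over flat arrays of positions:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of the per-frame cup/flower distance test.
// Structure-of-arrays layout: positions live in flat, contiguous
// memory, so the loop is cache-friendly and easy to vectorize or
// parallelize (which is what DOTS/Burst provide in Unity).
struct Float2 { float x, y; };

// For each flower, find the nearest cup within bloomRadius and write
// a 0..1 bloom weight that the VAT-driven shader can consume.
// Assumes weights.size() == flowers.size().
void UpdateBloomWeights(const std::vector<Float2>& flowers,
                        const std::vector<Float2>& cups,
                        float bloomRadius,
                        std::vector<float>& weights)
{
    const float r2 = bloomRadius * bloomRadius;
    for (std::size_t i = 0; i < flowers.size(); ++i) {
        float best = r2;                    // clamp to the bloom radius
        for (const Float2& c : cups) {
            const float dx = flowers[i].x - c.x;
            const float dy = flowers[i].y - c.y;
            const float d2 = dx * dx + dy * dy;
            if (d2 < best) best = d2;
        }
        weights[i] = 1.0f - best / r2;      // 1 at the cup, 0 at the edge
    }
}
```

With 5,000 flowers and a handful of cups this is only tens of thousands of distance tests per frame; the real win is the flat layout and the ability to split the outer loop across worker threads, which keeps the cost nearly flat as cups are added.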

 

 
 

The sections below cover the remaining hassles: modularizing the software over the network, handling camera distortion, and training models to track cups.

  1. Modularizing parts of the software via the network.

    • When developing interactive applications, I often had to rely on third-party software development kits (SDKs) such as RealSense, Kinect, or OpenCV. However, most of these SDKs were not originally implemented in C# or designed for non-game-engine environments. Some were ported to game engines, but the ports were poorly maintained and often introduced bugs that were difficult to fix due to limited access to the source code. Another limitation was that these SDKs ran on the main game thread, which became a performance bottleneck.

    • Additionally, recovering from errors caused by these third-party SDKs was challenging, often requiring a complete restart of the application and resulting in an unappealing user experience. To address these issues and leverage machine learning (ML) frameworks effectively, I wanted to use C++ for its speed, stability, and low resource consumption compared to typical Python applications. Python is a useful language, but for ML tasks that demand high computational throughput, its performance can be limiting.

    • For my implementation, I decided to use Unity to handle the tracked information and Darknet for real-time object detection. To facilitate future development and ensure reliable communication between the two, I used ZeroMQ (ZMQ) as the networking layer. This scheme allowed fast, stable ML execution with low CPU/GPU usage and consistent memory consumption.

 
 

The picture above shows what I call the proxy server: it relays messages within localhost based on topic. An app can subscribe to a topic, or publish messages under a topic.
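
A minimal sketch of such a topic proxy using libzmq's XSUB/XPUB pattern (the endpoints and ports here are placeholders, not the production values): publishers connect to the XSUB side, subscribers to the XPUB side, and zmq_proxy forwards messages and subscription frames between them.

```cpp
#include <zmq.h>

// Topic-based local proxy: publishers connect to the XSUB frontend,
// subscribers to the XPUB backend, and zmq_proxy shuttles messages
// (and subscription frames) between the two sides.
int main()
{
    void* ctx = zmq_ctx_new();

    void* frontend = zmq_socket(ctx, ZMQ_XSUB);  // apps publish here
    zmq_bind(frontend, "tcp://127.0.0.1:5559");  // placeholder port

    void* backend = zmq_socket(ctx, ZMQ_XPUB);   // apps subscribe here
    zmq_bind(backend, "tcp://127.0.0.1:5560");   // placeholder port

    zmq_proxy(frontend, backend, nullptr);       // blocks forever

    zmq_close(frontend);
    zmq_close(backend);
    zmq_ctx_destroy(ctx);
    return 0;
}
```

Because the proxy owns both well-known endpoints, individual apps can crash and reconnect without any other process caring, which is what makes the restart story so much better than in-engine SDK integrations.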

 
 

The picture below shows what I call the router server: it relays messages from computer to computer based on IP/identity. A message can also be handed off to a machine's local proxy, which relays it to the appropriate subscribers.
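
A hedged sketch of the routing idea (the framing and endpoint are hypothetical, and error handling is elided): each peer connects as a DEALER with a fixed routing id, sends [target-id][payload], and the ROUTER rewrites the envelope so the target receives [sender-id][payload].

```cpp
#include <zmq.h>

// Central router: peers connect as DEALER sockets with a fixed routing
// id (set via zmq_setsockopt(sock, ZMQ_ROUTING_ID, ...)), send
// [target-id][payload], and receive [sender-id][payload] back.
int main()
{
    void* ctx = zmq_ctx_new();
    void* router = zmq_socket(ctx, ZMQ_ROUTER);
    zmq_bind(router, "tcp://*:5570");            // placeholder endpoint

    while (true) {
        // ROUTER prepends the sender's id, so a well-formed request
        // arrives as three frames: [sender][target][payload].
        zmq_msg_t sender, target, payload;
        zmq_msg_init(&sender);  zmq_msg_recv(&sender, router, 0);
        zmq_msg_init(&target);  zmq_msg_recv(&target, router, 0);
        zmq_msg_init(&payload); zmq_msg_recv(&payload, router, 0);

        // Re-address: ROUTER routes on the first outgoing frame, so
        // swap target and sender to deliver [sender][payload] there.
        zmq_msg_send(&target, router, ZMQ_SNDMORE);
        zmq_msg_send(&sender, router, ZMQ_SNDMORE);
        zmq_msg_send(&payload, router, 0);
    }
}
```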

 
 
 

2. Tracking tea cup position

  • Initially, we employed depth sensors like Kinect and RealSense for our camera setup. However, these cameras had wide-angle lenses with significant distortion, so we had to place them 1.6m above the table, which was not aesthetically pleasing. There were also issues with the Kinect Unity SDK freezing unexpectedly, and the USB connections required powered extension cables, which often malfunctioned. They did, however, provide built-in infrared imaging for tracking, which mattered since we were projecting images on top of the tea.

  • Our initial tracking method used OpenCV's circle detection, but the circle search operation was CPU-intensive, limiting our ability to achieve better visuals.

  • To overcome these limitations, I opted for a Power over Ethernet (PoE) camera, which gave a reliable connection with no failures. I selected a 6mm lens with an 820nm IR long-pass filter and installed a 900nm IR light, which is invisible to the human eye (light around 800nm is still faintly visible as a red dot). The lens I purchased had a distortion rate of 2% and was reasonably priced. I used Calib.io's calibration tool to calculate the lens distortion and applied a remapping step to correct it (see the sketch after this list).

  • For tracking, I initially experimented with YOLOv4 and later upgraded to YOLOv7. After numerous tests and trials with YOLO and YOLO-tiny under custom configurations, I achieved satisfactory results with approximately 2 hours of training.
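
A minimal sketch of that undistortion step with OpenCV (the file and field names are placeholders for the intrinsics exported from the Calib.io calibration):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>

// Build the undistortion maps once from the calibrated intrinsics,
// then remap every incoming frame. The YAML file is a hypothetical
// export of the Calib.io calibration result.
cv::Mat cameraMatrix, distCoeffs, map1, map2;

void InitUndistortMaps(const cv::Size& imageSize)
{
    cv::FileStorage fs("calibration.yml", cv::FileStorage::READ);
    fs["camera_matrix"] >> cameraMatrix;
    fs["distortion_coefficients"] >> distCoeffs;

    cv::initUndistortRectifyMap(cameraMatrix, distCoeffs, cv::Mat(),
                                cameraMatrix, imageSize,
                                CV_16SC2, map1, map2);
}

// Per frame: a single lookup pass instead of re-solving the model.
void Undistort(const cv::Mat& src, cv::Mat& dst)
{
    cv::remap(src, dst, map1, map2, cv::INTER_LINEAR);
}
```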

 
 

Left: distortion map of the 6mm lens placed 3m above the desk. Right: distortion map of the RealSense placed 1.6m above the desk.

 
 
 

3. Training Cups

Around 80 images were enough to track the cups. I trained yolov4-tiny and yolov7-tiny models on them, which took around 1.5 hours. I used Darknet, which is written in C++, because I wanted to use OpenCV with CUDA and be as efficient (fast) as possible. A sketch of consuming the trained weights follows below.
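
For illustration only (TeaTracker itself links against Darknet directly), here is one way the same trained cfg/weights pair could be loaded through OpenCV's DNN module with its CUDA backend; the file names are hypothetical, and this requires an OpenCV build with CUDA support:

```cpp
#include <opencv2/dnn.hpp>

// Load the trained tiny model and prefer GPU execution. The cfg and
// weights file names below are hypothetical stand-ins for the pair
// produced by Darknet training.
cv::dnn::Net LoadCupDetector()
{
    cv::dnn::Net net = cv::dnn::readNetFromDarknet(
        "yolov4-tiny-cups.cfg", "yolov4-tiny-cups.weights");
    net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
    net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
    return net;
}
```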

 
 
 

4. The TeaTracker

TeaTracker is an ImGui-based GUI that connects to various cameras (Kinect, RealSense, LucidCam) and runs YOLO to detect objects.

1) Detected Video: shows camera images and detections in real time.
2) Status: shows the current status of detection and the software.
3) Static Config: shows the loaded configuration, mostly for the camera and the YOLO dataset.
4) Threads: TeaTracker is a multi-threaded application, to make it as fast as possible; this section controls the FPS of each thread to limit CPU usage. Around 45fps was good enough for interaction, and CPU usage stayed below 10%. Awesome. (A sketch of the per-thread frame limiter follows this list.)
5) Dynamic Config: used to set the table boundaries, offsets, camera exposure, network data type, and tracker info.
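
A minimal sketch of the per-thread frame limiter (the names are hypothetical): each worker sleeps until its next scheduled tick, so capping a thread at ~45fps directly caps its CPU share.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Generic rate-limited worker loop: runs `step` at most `fps` times
// per second by sleeping until the next tick, keeping CPU usage low.
template <typename Step>
void RunAtFps(std::atomic<bool>& running, double fps, Step step)
{
    using clock = std::chrono::steady_clock;
    const auto period = std::chrono::duration_cast<clock::duration>(
        std::chrono::duration<double>(1.0 / fps));

    auto next = clock::now();
    while (running.load()) {
        step();                      // e.g. grab frame, run detection
        next += period;
        std::this_thread::sleep_until(next);
    }
}
```

The capture, detection, and network workers would each get their own target FPS, which is exactly what the Threads panel exposes.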

 
TeaTracker's debugger GUI, built with Dear ImGui on top of Darknet.

 
 

5. The Visual

As described in the concept, the moon appears at the center of the cup, and the moonlight causes the flowers to bloom!

I need more videos for this :p