MediaPipe Computer Vision on Arm Virtual Hardware

How to download several ML examples and run them on the virtual Raspberry Pi 4B board available from Arm Virtual Hardware (AVH).

MediaPipe is an open source, cross-platform, customizable ML solution for live and streaming media. It’s the ML engine inside a number of leading products from Google, including YouTube and Google Photos. It’s a great building block for creating your own ML solutions, and you can learn more about it and download the source code here.

Beyond being a starting point for your own ML development, MediaPipe also includes examples that showcase what different platforms can do. In this blog, we will go through the steps to download several of these ML examples and run them on the virtual Raspberry Pi 4B board available from Arm Virtual Hardware (AVH). You don’t need to know anything about coding or ML to run any of these examples.

Virtual Raspberry Pi 4 boards running on AVH can use your computer’s webcam and microphone just as you’d use sensors on a physical board. This compatibility allows computer vision developers to rapidly build and test MediaPipe Solutions AI/ML models without the need for a physical Raspberry Pi board.


Which MediaPipe Example Tasks are Supported?

Currently, six of the AI/ML examples are compatible with the Raspberry Pi:

  1. Face Landmark Detection
  2. Gesture Recognition
  3. Image Classification
  4. Audio Classification
  5. Text Classification
  6. Object Detection

You can learn more about these example tasks in the MediaPipe Solutions guide.

Configure Your Virtual Raspberry Pi

  1. Create a virtual Raspberry Pi 4B board running Raspberry Pi OS Desktop. (Refer to our Quickstart for Raspberry Pi 4 guide for more details.)


  2. When the device boots, close the Welcome screen and SSH dialog.


  3. Open the Sensors tab and enable the Camera and Microphone sensors. (You will also need to grant camera and microphone permissions in your browser settings.)


  4. Open the Console tab and log in with the default credentials (pi/raspberry). You should see a shell prompt.


  5. Set your console to use the default local display. In the Console tab, run the following command: 

    export DISPLAY=:0


  6. Install the MediaPipe pip package and clone the examples repository. In the Console tab, run the following commands:

    python -m pip install mediapipe

    git clone https://github.com/googlesamples/mediapipe
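Before running any example, it can help to confirm that the two console steps above took effect. The following is a minimal sketch, not part of the MediaPipe repository: `check_environment` is a hypothetical helper name, and it only checks that `DISPLAY` is set and that a `mediapipe` package can be found by the current Python interpreter.

```python
import importlib.util
import os


def check_environment(env=None):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration only; it is not shipped with MediaPipe.
    """
    env = os.environ if env is None else env
    problems = []
    # Step 5 sets DISPLAY so graphical output goes to the local display.
    if not env.get("DISPLAY"):
        problems.append("DISPLAY is not set; run: export DISPLAY=:0")
    # find_spec returns None when the top-level package cannot be located.
    if importlib.util.find_spec("mediapipe") is None:
        problems.append(
            "mediapipe is not installed; run: python -m pip install mediapipe"
        )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Saved to a file and run with `python3` in the same Console tab, the script prints nothing when both checks pass.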


Run the MediaPipe Computer Vision Example Tasks

  1. Run the gesture recognition example task, which will track the position of your palm, fingers and knuckles. When you are finished, press Ctrl+C in the Console tab to exit. 

    cd mediapipe/examples/gesture_recognizer/raspberry_pi

    sh setup.sh && python3 recognize.py --numHands=2


  2. Then try the object detection example task, which will detect objects in the camera view and label them. When you are finished, press Ctrl+C in the Console tab to exit.

    cd ../../object_detection/raspberry_pi

    sh setup.sh && python3 detect.py


  3. Finally, run the image classification example task. When you are finished, press Ctrl+C in the Console tab to exit. 

    cd ../../image_classification/raspberry_pi

    sh setup.sh && python3 classify.py

  4. When you are finished, shut down your virtual Raspberry Pi board.
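The example runs above share one pattern: change into the task's `raspberry_pi` directory, run `setup.sh`, then start the task script. As a rough sketch of that pattern, the helper below builds the shell command for each task. The `EXAMPLES` table and `build_command` are hypothetical names introduced here, and the directory names `object_detection` and `image_classification` are assumptions about the layout of the cloned repository; check them against your checkout.

```python
import shlex

# Hypothetical lookup table: task -> (example directory, script, extra args).
# Directory names are assumed from the cloned repository layout.
EXAMPLES = {
    "gesture": ("gesture_recognizer", "recognize.py", ["--numHands=2"]),
    "object": ("object_detection", "detect.py", []),
    "image": ("image_classification", "classify.py", []),
}


def build_command(task, repo_root="mediapipe"):
    """Build the shell command line that sets up and runs one example task."""
    directory, script, args = EXAMPLES[task]
    workdir = f"{repo_root}/examples/{directory}/raspberry_pi"
    # Quote each argument so the command stays safe to paste into a shell.
    run = " ".join(["python3", script, *map(shlex.quote, args)])
    return f"cd {shlex.quote(workdir)} && sh setup.sh && {run}"


if __name__ == "__main__":
    for task in EXAMPLES:
        print(build_command(task))
```

The commands assume you start from the directory in which you ran `git clone`, and each one is meant to be pasted into the Console tab on its own.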

Summary 

Using MediaPipe Solutions on AVH allows you to rapidly build and test AI/ML models, including computer vision models, on virtual Raspberry Pi boards using only a local workstation and integrated webcam.

AVH reduces the hardware constraints placed on AI/ML models and is an ideal tool for agile developers working on AI/ML-based computer vision.