@velovix
Last active April 11, 2024 15:26
The text version of my GStreamer talk at sunhacks 2020

Introduction

Hi everyone! Today I'm going to be giving you a crash course in video processing using Python. Coming out of this talk, you'll be able to take video from pretty much any source, decode it, apply visual effects, and display it on-screen. To do this, we're going to be using a library named GStreamer, an incredibly powerful and versatile framework. This is the same tool that the pros use, but don't feel intimidated! GStreamer actually makes it very easy to do impressive things with video and you'll be well on your way to making something great in just the time it takes to watch this talk.

If you fall behind at any point during the live presentation, don't worry! I have a text version of this talk available with the same content and more. There should be a link in the description.

Installing Dependencies

Let's start by installing everything we'll need to start using GStreamer in Python. This is probably the hardest part, so if you managed to do it before this talk, it's all smooth sailing from here! If not, no worries! I'm going to go through how to install everything on Windows 10 right here. I would recommend opening up the text version of this talk, because I have links to the stuff we'll be downloading and you'll probably want to copy and paste a few of the long commands we'll be running. If you're using macOS or Linux, you'll find separate instructions for how to install everything for those platforms there as well.

Windows 10

We're going to be using a tool called MSYS2 to download everything we need to get started. MSYS2 makes it easy to set up development environments on Windows.

Download the latest stable release of MSYS2 from the releases page. At the time of writing, the latest release (2020-09-03) is available here. Then run the installer, accepting all the defaults, but unchecking "Run MSYS2".

Once it's installed, start "MSYS2 MinGW 64-bit" from the Start Menu. This will open up the MSYS2 terminal.

Let's get MSYS2 up-to-date by running the following command:

pacman -Syu

After this command finishes, the terminal may need to close. Just open it right back up again!

Now, we're ready to install everything we need! The following command installs GStreamer, some plugins, Python, and the PyGObject library.

pacman -S mingw-w64-x86_64-gstreamer mingw-w64-x86_64-gst-devtools mingw-w64-x86_64-gst-plugins-{base,good,bad,ugly} mingw-w64-x86_64-python3 mingw-w64-x86_64-python3-gobject

Finally, you're going to need a text editor to write code! I use Visual Studio Code but you can use whatever you like. Even Notepad is fine!

macOS

Homebrew, a package manager for macOS, makes it easy to install everything we need for this project. To install Homebrew, simply run the following command in the terminal:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"

Then, run this command to install everything we need:

brew install gstreamer gst-devtools gst-plugins-{base,good,bad,ugly} python@3 pygobject3

Ubuntu, Debian, elementary OS, Pop!_OS

Installing everything we need on Ubuntu and related operating systems is easy! Just run the following command in the terminal:

sudo apt install libgstreamer1.0-0 gstreamer1.0-plugins-{base,good,bad,ugly} gstreamer1.0-tools python3-gi gir1.2-gstreamer-1.0

Arch Linux, Manjaro

Like Ubuntu, installing everything on Arch Linux or Manjaro is just a matter of running the following command in the terminal:

sudo pacman -S gstreamer gst-plugins-{base,good,bad,ugly} python python-gobject

Digital Video Concepts

While you're waiting on everything to install, let's take a step back. A wise scholar once said: Before you decode the video, you must understand the video. You must be the video.

At a fundamental level, video is presented to viewers as a list of images that are shown one after the other at a high enough speed for our eyes to see it as a moving picture. Pretty simple, right? Well, there's just one problem. Storing all these thousands and thousands of images takes up a huge amount of space. An average 10 minute YouTube video would require over 100 GB of storage, and a feature-length movie could take upwards of a terabyte! Where is our escape from this madness??
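
If you're curious where those numbers come from, here's a rough back-of-the-envelope calculation in Python. It assumes 1080p footage at 30 frames per second with 3 bytes of color per pixel, which is just an illustration; real footage varies.

# Rough size of uncompressed 1080p video at 30 fps, 3 bytes per pixel
width, height = 1920, 1080
bytes_per_pixel = 3
fps = 30
seconds = 10 * 60  # a 10 minute video

total_bytes = width * height * bytes_per_pixel * fps * seconds
print(total_bytes / 1e9)  # roughly 112 GB; a 2 hour movie works out to about 1.3 TB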

Well, luckily, mathematicians and computer scientists have found many clever and sophisticated ways to compress video data down to a fraction of its original size. These researchers have turned their work into standards that define exactly how their technology works and how video data that is compressed this way can be decoded. We call these standards "video compression formats", and some popular ones include H.264, VP8, and AV1 among many others.

All the while, other smart people needed to figure out in what way the compressed video data should be saved to a video file, or split into chunks and streamed over the internet. This resulted in the development of special formats that hold both the compressed video data and additional information, like the title of the video, its resolution, and other stuff. We call these "container formats", and some popular ones include MPEG-4 and WebM.

So, in the end, you use a video camera to record something, those raw images get compressed in a video compression format, and once you're done your video is wrapped up with a nice little bow using a container format. Now, the video file is ready to be stored on your computer or streamed out for all the world to see.

GStreamer Concepts

Now that we know how video works, we can begin to understand how GStreamer lets us work with it. Working with GStreamer is kind of like creating an assembly line in a factory. Each step in the assembly line is in charge of doing one thing, and the results of one step are passed on to the next step until the process is complete. GStreamer calls this assembly line a "pipeline", and the steps are known as "elements".

Every pipeline starts with a source element, has some number of elements that process the data in the middle, and ends with a sink element. The source element is in charge of getting video data from somewhere, like a file on your computer or a video stream hosted online. That data is then passed to the next element, which does some processing on the data, and the result is passed on to the next element in the pipeline and so on. Finally, the fully processed data is passed to the sink element, which will take care of making the data available somewhere. That might involve saving it to your computer, hosting it as a live video stream, or passing it back to your application.

GStreamer has a lot of elements that do all kinds of different things. Each one has a name that we refer to it by, and certain rules governing what kinds of data it can take as input and what it produces as output.

Now, putting together one of these pipelines might sound hard, but GStreamer makes it pretty easy. All you have to do is give GStreamer a string with the names of each element you want in your pipeline, separated by exclamation marks. And that's it! GStreamer will take care of creating these elements and attaching them to each other.
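
As a taste of what that looks like, here's a simple pipeline description that generates a test pattern and shows it on-screen. You can try it without writing any Python by handing it to the gst-launch-1.0 command-line tool that comes with GStreamer (the exclamation marks link a test video source, a format converter, and an on-screen sink):

gst-launch-1.0 videotestsrc ! videoconvert ! autovideosink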

Let's Get to the Code

With all that background in mind, let's jump into the code. Now, we're going to be writing in Python but the GStreamer library is written in C so we're going to be using what's called a "binding". A binding is simply a library that allows you to use a library in one language from another language. GStreamer's Python binding library is called PyGObject, and we import it like this:

import gi

Now we need to tell PyGObject the minimum version of GStreamer that our program requires, which the library shortens to "Gst". Once we've done that, we're ready to import the "Gst" module, as well as the "GLib" module which we will use shortly. Make sure to call Gst.init() to initialize GStreamer before doing anything else.

gi.require_version("Gst", "1.0")

from gi.repository import Gst, GLib


Gst.init()

After that, we need to start the main loop. The main loop is in charge of handling events and doing some other background tasks. Here we'll start it in a new thread, so that we can do other things in our main thread.

from threading import Thread

main_loop = GLib.MainLoop()
thread = Thread(target=main_loop.run)
thread.start()

Finally, we're ready to construct a simple pipeline! Like I mentioned earlier, all pipelines start with a source. Which source element we use will depend on where we want to get our video from. For now, let's try getting video from our webcam. On Windows, we can use the ksvideosrc element to do this. If you're on macOS, try autovideosrc. For Linux, it's v4l2src.

Then, we're going to follow that up with a decodebin element. This is a super helpful element that takes care of figuring out what container format and video compression format a source is providing, and handles decoding it for us into raw images.

Next let's add a videoconvert element, another handy tool that takes care of any format differences between the images that decodebin provides and what our next element expects.

Our pipeline is almost done! Just like how every pipeline starts with a source, they also end with a sink! Our sink of choice today will be autovideosink, which will display our webcam footage on-screen.

pipeline = Gst.parse_launch("ksvideosrc ! decodebin ! videoconvert ! autovideosink")

We've defined our pipeline, but we're not quite done yet! We still need to start the pipeline up. To do this, we use the set_state method, which asks GStreamer to take care of initializing and playing our pipeline. All that work will be done in the background, so we can continue doing whatever we want in our program.

pipeline.set_state(Gst.State.PLAYING)
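
If you want to catch obvious startup problems, set_state also returns a value telling you whether the state change could begin at all. Checking it is optional, but a small sketch looks like this:

ret = pipeline.set_state(Gst.State.PLAYING)
if ret == Gst.StateChangeReturn.FAILURE:
    print("The pipeline could not be started!")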

For this example, all we're going to do is wait around while our webcam footage is being played on-screen until you stop the program. At that point, we'll ask the pipeline to stop and clean up by setting it to the NULL state. Then, finally, we'll stop the main loop we started earlier.

from time import sleep

try:
    while True:
        sleep(0.1)
except KeyboardInterrupt:
    pass

pipeline.set_state(Gst.State.NULL)
main_loop.quit()

Here's the example in full. Again, make sure to replace ksvideosrc with your platform's equivalent if you're not running on Windows.

from threading import Thread
from time import sleep

import gi

gi.require_version("Gst", "1.0")

from gi.repository import Gst, GLib


Gst.init()

main_loop = GLib.MainLoop()
thread = Thread(target=main_loop.run)
thread.start()

pipeline = Gst.parse_launch("ksvideosrc ! decodebin ! videoconvert ! autovideosink")
pipeline.set_state(Gst.State.PLAYING)

try:
    while True:
        sleep(0.1)
except KeyboardInterrupt:
    pass

pipeline.set_state(Gst.State.NULL)
main_loop.quit()

Having Fun

Now that we've got our example application running, we can add some cool filters to our webcam stream! Some personal favorites of mine are edgetv and rippletv. Just make sure to add a videoconvert before and after them to ensure that the element is getting images in a format it's compatible with.

pipeline = Gst.parse_launch("ksvideosrc ! decodebin ! videoconvert ! edgetv ! "
                            "videoconvert ! autovideosink")

Doing Your Own Thing with Video

GStreamer has a huge number of fun and useful elements for just about everything, but what if you wanted to do something custom? Maybe you want to implement your own special filter, or send the images off to another service or library. For these cases, GStreamer provides the appsink element, which allows you to take data out of the pipeline. Let's check it out.

We're going to use our original webcam pipeline, but replace the autovideosink with appsink. We're going to give the element a name so that we can look it up from the pipeline and interact with it.

pipeline = Gst.parse_launch("ksvideosrc ! decodebin ! videoconvert ! "
                            "appsink name=sink")
appsink = pipeline.get_by_name("sink")

Now, we can pull images out of the appsink using the try_pull_sample method.

try:
    while True:
        # Wait up to one second for a new sample to arrive
        sample = appsink.try_pull_sample(Gst.SECOND)
        if sample is None:
            continue

        print("Got a sample!")
except KeyboardInterrupt:
    pass
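
If your hack needs the actual image data, each sample carries a buffer that can be mapped into memory, plus caps describing the image. Here's a rough sketch of what that might look like inside the loop above, in place of the print:

caps = sample.get_caps()
structure = caps.get_structure(0)
width = structure.get_value("width")
height = structure.get_value("height")

buffer = sample.get_buffer()
success, map_info = buffer.map(Gst.MapFlags.READ)
if success:
    data = map_info.data  # raw image bytes, ready for NumPy, Pillow, etc.
    buffer.unmap(map_info)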

That's all I'm going to show you about appsink in this talk, just to whet your appetite. But, I have some more examples in the text version of this talk if this sounds like something your hack needs.

Conclusion

And... that's a wrap! Thanks so much for listening, and I hope you found it enjoyable. If you have any questions, please feel free to reach out to me on the sunhacks Discord. Again, my name is Tyler and I should be marked as a "mentor". Happy hacking!

Extra Credit

My pipeline isn't working. How do I find out why?

When GStreamer encounters a problem, it records an error message, but by default these logs are hidden. To see them, we need to set the GST_DEBUG environment variable.

For example, if you're running your program like this:

python3 main.py

Run it like this, instead:

GST_DEBUG=2 python3 main.py

However, reading GStreamer logs can sometimes feel like an art form. Feel free to reach out to me if you're having trouble understanding what these logs are telling you!
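
Another option is to watch the pipeline's bus from inside Python, which surfaces the same errors as Python objects. Here's a rough sketch using the pipeline from earlier; note that this call blocks until an error or end-of-stream message arrives:

bus = pipeline.get_bus()
msg = bus.timed_pop_filtered(
    Gst.CLOCK_TIME_NONE,
    Gst.MessageType.ERROR | Gst.MessageType.EOS,
)
if msg.type == Gst.MessageType.ERROR:
    error, debug_info = msg.parse_error()
    print("Error from", msg.src.get_name(), ":", error.message)
    print("Debug info:", debug_info)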

@JaisonJHH commented Jun 27, 2022

> @JaisonJHH I'm not sure why that would be happening... Does it work if you use GLib.MainLoop.new() instead of GLib.MainLoop()? Either one should be equivalent. Can you tell me more about your environment?
>
> The example may appear like it's working, but some key GStreamer functionality won't work unless the main loop is running properly.

@velovix Okay, I'll try with MainLoop.new(). Regarding the environment, I just followed your YouTube video exactly on an STM32MP1 board.

By the way, I tried another method of implementing the same code and it worked without any issue:

import gi

gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)

class Pipeline(object):
    def __init__(self):
        pipe_desc = ("rtspsrc location=rtsp://onviftest:onviftest@192.168.1.12:554/stream1 latency=0 buffer-mode=auto ! rtph264depay ! decodebin ! videoconvert ! autovideosink sync=false")

        self.pipeline = Gst.parse_launch(pipe_desc)


loop = GLib.MainLoop()

p = Pipeline()

try:
    p.pipeline.set_state(Gst.State.PLAYING)
    loop.run()

except KeyboardInterrupt:
    p.pipeline.set_state(Gst.State.NULL)
    loop.quit()

This works fine, give it a try. Also, can you provide a basic boilerplate for using webrtcbin? Is it just a kind of appsink for WebRTC, or can it actually create a peer?

@Suryansh1089

@velovix I have to generate an RTSP link for my camera so that I can stream the feed on VLC from a remote location. How can it be done using GStreamer?
Please guide me here.
Thank you for your tutorials.

@velovix commented Jul 19, 2022

@Suryansh1089 This is a big topic, but you're going to need gst-rtsp-server in order to do this. They have a few examples that show you how you can give the library a GStreamer pipeline, and it will take care of making the output of that pipeline available along with an RTSP URL.
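
Here's a minimal sketch of that idea, loosely based on the project's test-launch example. The v4l2src element assumes a Linux webcam, so swap in your platform's source, and the encoder settings are just a starting point:

import gi

gi.require_version("Gst", "1.0")
gi.require_version("GstRtspServer", "1.0")

from gi.repository import Gst, GLib, GstRtspServer

Gst.init()

server = GstRtspServer.RTSPServer()
factory = GstRtspServer.RTSPMediaFactory()
factory.set_launch(
    "( v4l2src ! videoconvert ! x264enc tune=zerolatency ! rtph264pay name=pay0 pt=96 )"
)
factory.set_shared(True)
server.get_mount_points().add_factory("/stream", factory)
server.attach(None)

# The stream should now be reachable at rtsp://<your-ip>:8554/stream
GLib.MainLoop().run()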

@faridelya

Hi @velovix, what about a GStreamer Python tutorial?
We will be happy if you start these tutorials.

@velovix commented Feb 3, 2023

@faridelya Sorry, I'm not sure what you're asking. The official tutorials are great and I don't intend to replace them! I did this talk as part of a hackathon so that I could answer participants' questions while the event was ongoing.

@franhidalgocavs

Hi, I need to stream from 3 X11 windows with 3 different stream keys, and I need to do it with 0 ms delay. Can I do it with the GStreamer command-line tool?
P.S.: Sorry for my English.

@velovix commented Feb 21, 2023

@franhidalgocavs You may be able to use ximagesrc to accomplish this. You can set the xid parameter to select which window to use. You can then encode it in some format and stream it out using gst-rtsp-server. Achieving 0ms latency will be very difficult, though. Make sure to read up on the available parameters for whatever encoder you decide to use, as most of them have ways to tweak their behavior for real-time streaming.
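
As a quick way to check that window capture works before worrying about encoding and streaming, you can try something like this on the command line. The xid value here is just a placeholder; a tool like xwininfo will tell you your window's real ID:

gst-launch-1.0 ximagesrc xid=0x3000001 ! videoconvert ! autovideosink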

@AhmedYasserrr commented Mar 4, 2024

That was awesome! Thank you for sharing your knowledge, and I hope you can do more advanced videos about GStreamer and other interesting topics like working with cameras, computer vision models, etc.
