Using ffmpeg and avconv
A hands-on look at command-line encoding. While this chapter was always going to be challenging to write, we think we have some good material here that acts as a base for looking at the subject.
ffmpeg and avconv (a fork of ffmpeg) are the command-line applications working behind the scenes in many desktop encoding applications, including ffmpegX, Handbrake and the SUPER encoder. They are also essential to many desktop video editing applications, including Kdenlive.
These tools can also be very handy for working with video on Internet servers. As you will see later in the chapter, you can do much more with them than simply transcode video from one file format to another.
WebM settings
For an overview of useful ffmpeg/avconv options related to encoding WebM files, see http://wiki.webmproject.org/ffmpeg and http://ffmpeg.org/ffmpeg.html#libvpx. A cheat sheet of ready-made commands is also available at http://rodrigopolo.com/ffmpeg/cheats.html.
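To give a flavour of what those pages cover, here is a minimal WebM encoding sketch. It is not taken from the guides above: the file names, the -crf quality value and the bitrates are placeholders to adjust for your own material.

ffmpeg -i input.mp4 -c:v libvpx -crf 10 -b:v 1M -c:a libvorbis -b:a 128k output.webm

On systems that ship avconv rather than ffmpeg, the same options should work; just substitute the binary name.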
H264 settings
For a guide on creating H.264 files that work on many mobile devices, see http://h264.code-shop.com/trac/wiki/Encoding. For a general x264 encoding guide, see http://ffmpeg.org/trac/ffmpeg/wiki/x264EncodingGuide.
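As a hedged starting point rather than a definitive recipe, the sketch below produces an H.264/AAC MP4. The file names, the preset, the -crf value and the audio bitrate are placeholders; -profile:v baseline is only there for compatibility with older mobile devices and can be dropped otherwise, and older builds need -strict experimental to enable the built-in AAC encoder.

ffmpeg -i input.mov -c:v libx264 -preset slow -crf 22 -profile:v baseline -c:a aac -strict experimental -b:a 128k output.mp4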
ffmpeg and numpy
This guide, written by RMO, is a good introduction to taking your work with ffmpeg to the next level on the server.
In the past year and a half, daf and I have undertaken a series of media experiments using Python's excellent numpy library. The outcomes of these trials are largely encapsulated in our numm project, which is available in the Debian and Ubuntu repositories as python-numm.
Numm uses GStreamer for a/v decoding and encoding and provides a minimalist livecoding API, but in the interests of simplicity and portability, I've been reimplementing some of the core functionality as a wrapper around the ffmpeg binary.
loading a video as numpy arrays
import numpy as np
import subprocess

def video_frames(path, width=320, height=240, fps=30):
    # ask ffmpeg to decode the file to raw RGB frames on stdout
    cmd = ['ffmpeg', '-i', path,
           '-vf', 'scale=%d:%d' % (width, height),
           '-r', str(fps),
           '-an',
           '-c:v', 'rawvideo',
           '-f', 'rawvideo',
           '-pix_fmt', 'rgb24',
           '-']
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    while True:
        # read one frame's worth of bytes (width * height * 3 for RGB)
        arr = np.fromstring(p.stdout.read(width*height*3), dtype=np.uint8)
        if len(arr) == 0:
            p.wait()
            return
        yield arr.reshape((height, width, 3))
saving frames as images
Our video_frames function gives us numpy buffers from a video file. Each buffer is a 3-d array -- height, width, color -- with 8-bit intensity values. To save numpy buffers as images, we use the Python Imaging Library to define a np2image function (from image.py):
import Image

def np2image(np, path):
    im = Image.fromstring('RGB', (np.shape[1], np.shape[0]), np.tostring())
    im.save(path)
For example, to save a video as a directory full of still images:
for idx, fr in enumerate(video_frames(path)):
    np2image(fr, '%06d.jpg' % (idx))
encoding numpy arrays to video
Alternatively, we can write a series of frames back to disk as a video:
def frames_to_video(generator, path, fps=30, ffopts=[]):
    p = None
    for fr in generator:
        if p is None:
            # start ffmpeg once we know the frame size, reading raw RGB from stdin
            cmd = ['ffmpeg', '-y',
                   '-s', '%dx%d' % (fr.shape[1], fr.shape[0]),
                   '-r', str(fps),
                   '-an',
                   '-c:v', 'rawvideo',
                   '-f', 'rawvideo',
                   '-pix_fmt', 'rgb24',
                   '-i', '-'] + ffopts + [path]
            p = subprocess.Popen(cmd, stdin=subprocess.PIPE)
        p.stdin.write(fr.tostring())
    p.stdin.close()
    p.wait()
a simple test
Assuming you've saved these functions to a file (see here) and imported them, we can re-encode a video:
frames_to_video(
    video_frames('/path/to/input/video.avi'),
    '/path/to/output/video.webm')
ffmpeg encoding parameters can be added as needed, though this is not an efficient (or in any way recommended) way to transcode videos:
frames_to_video(
    video_frames('/path/to/input/video.avi'),
    '/path/to/output/video.webm',
    ffopts=['-vb', '500K'])  # &c.
video synopses
Here are a few quick examples of the processing you can do by thinking about video as a series of arrays.
composite image
The average frame in a video:
INPUT_VIDEO = '/path/to/video'
W, H = (320, 240)

comp = np.zeros((H, W, 3), dtype=int)
nframes = 0
for fr in video_frames(INPUT_VIDEO, width=W, height=H):
    comp += fr
    nframes += 1
comp = (comp / nframes).astype(np.uint8)
np2image(comp, INPUT_VIDEO + '-comp.png')
scans
Slitscans and 0xScans are pixel-wide sweeps through a video. The slitscan below takes the center column of each frame, while the 0xScan averages each frame across its width:
slits = []
oxscan = []
for fr in video_frames(INPUT_VIDEO, width=W, height=H):
    slits.append(fr[:,W/2])
    oxscan.append(fr.mean(axis=1).astype(np.uint8))
slits = np.array(slits).transpose(1,0,2)
oxscan = np.array(oxscan).transpose(1,0,2)
np2image(slits, INPUT_VIDEO + '-slitscan.png')
np2image(oxscan, INPUT_VIDEO + '-oxscan.png')
Assessment Task
Do something freaky (or normal for that matter) with ffmpeg / avconv and paste in the command line input you used as a comment or blog post.
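If you need a starting point, one simple (and entirely optional) example is grabbing a single still from a video; the file names and timestamp below are placeholders:

ffmpeg -i input.avi -ss 00:00:10 -vframes 1 thumbnail.png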