Figure out what's going on with optical character recognition

If we ran OCR on the video and then used something like sumy to figure out what's going on, we could, for example, detect 'chapters' in the video.
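
A rough sketch of how the chapter detection could work, to make the idea concrete. It assumes the OCR step has already happened (for instance, Tesseract run on one frame every few seconds) and that we have a list of `(timestamp, text)` pairs. Instead of sumy, it uses plain word overlap between consecutive frames as a stand-in: when the on-screen text changes a lot, we guess a new chapter has started. The names `tokens`, `jaccard`, `detect_chapters` and the `0.3` threshold are made up for illustration and would need tuning on real videos.

```python
import re


def tokens(text):
    """Return the set of lowercase words found in one frame's OCR text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two word sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def detect_chapters(frames, threshold=0.3):
    """Guess chapter start times from OCR'd frames.

    frames: list of (timestamp_seconds, ocr_text), sorted by time.
    threshold: assumed cut-off; similarity below it starts a new chapter.
    """
    starts = []
    prev = None
    for timestamp, text in frames:
        current = tokens(text)
        # The first frame always opens a chapter; later frames open one
        # only when their text barely overlaps with the previous frame.
        if prev is None or jaccard(prev, current) < threshold:
            starts.append(timestamp)
        prev = current
    return starts


# Hypothetical OCR output, one sampled frame every 5 seconds.
frames = [
    (0, "Introduction Welcome to the course"),
    (5, "Introduction Welcome to the course overview"),
    (10, "Installing Python on Linux"),
    (15, "Installing Python on Linux and macOS"),
    (20, "Writing your first script"),
]
chapter_starts = detect_chapters(frames)
print(chapter_starts)
```

Once the boundaries are known, the OCR text inside each chapter could be concatenated and passed to a summarizer such as sumy to produce a short chapter title or description.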