mkmohangb/video-editing.md

## video-editing.md

      
Use the OpenAI Whisper API for transcription.
It supports word-level timestamps.
Filler words are not transcribed, and prompting did not help. However, a skipped filler word leaves a gap in the timestamps, which can be used for detection.
{'word': 'things', 'start': 25.799999237060547, 'end': 26.040000915527344}, 
{'word': 'So', 'start': 27.239999771118164, 'end': 27.799999237060547}, 
{'word': 'it', 'start': 27.799999237060547, 'end': 28.0}


In the above transcription, there is a gap of about 1.2 s between the end of `things` (26.04) and the start of `So` (27.24).
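A minimal sketch of the gap check, assuming the word entries are plain dicts shaped like the sample above. The helper name `find_gaps` and the 0.5 s threshold are made up for illustration; the threshold would need tuning, since a natural pause produces the same kind of gap as a dropped filler word.

```python
def find_gaps(words, min_gap=0.5):
    """Return (gap_start, gap_end) pairs for pauses between consecutive words.

    `words` is a list of dicts with 'word', 'start' and 'end' keys (seconds).
    `min_gap` is an assumed threshold, not a value from the Whisper API.
    """
    gaps = []
    for prev, cur in zip(words, words[1:]):
        pause = cur["start"] - prev["end"]
        if pause >= min_gap:
            gaps.append((prev["end"], cur["start"]))
    return gaps


words = [
    {'word': 'things', 'start': 25.799999237060547, 'end': 26.040000915527344},
    {'word': 'So', 'start': 27.239999771118164, 'end': 27.799999237060547},
    {'word': 'it', 'start': 27.799999237060547, 'end': 28.0},
]

# Only the pause between 'things' and 'So' exceeds the threshold.
for start, end in find_gaps(words):
    print(f"possible filler between {start:.2f}s and {end:.2f}s")
```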


The `select` filter in ffmpeg can be used to remove sections of a video, such as the spans where filler words occur. Relevant SO answer.
auto-editor: command-line video editing tool.
ffmpeg-python: video editing in memory.
Eddie: smart video editor.
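A rough sketch of how the `select` approach could be driven from Python, assuming the spans to cut are already known as (start, end) pairs in seconds. The helper name `build_cut_command` and the file names are hypothetical. It only builds the argument list: `select`/`aselect` drop frames whose timestamp falls inside a cut, and `setpts`/`asetpts` regenerate timestamps so the output has no holes.

```python
def build_cut_command(src, dst, cuts):
    """Build an ffmpeg argument list that drops the given (start, end) spans.

    `cuts` is a non-empty list of (start, end) pairs in seconds.
    """
    if not cuts:
        raise ValueError("need at least one span to cut")
    # between(t,a,b) is 1 inside a span; summing the terms acts as a logical OR.
    dropped = "+".join(f"between(t,{s:.3f},{e:.3f})" for s, e in cuts)
    keep = f"not({dropped})"
    return [
        "ffmpeg", "-i", src,
        # Quotes protect the commas inside the expression from the filter parser.
        "-vf", f"select='{keep}',setpts=N/FRAME_RATE/TB",
        "-af", f"aselect='{keep}',asetpts=N/SR/TB",
        dst,
    ]


cmd = build_cut_command("input.mp4", "output.mp4", [(26.04, 27.24)])
print(" ".join(cmd))
# To actually run it (requires ffmpeg on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

The cut spans could come straight from the timestamp gaps in the transcription. This path re-encodes the video, so it is slower than a stream copy.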