- Work To Be Done.
- Work In Progress.
- Work Completed.
- Set up a blog. (https://saurabhshri.github.io)
- Write a blog post about getting accepted into GSoC. 🔥 (https://saurabhshri.github.io/2017/05/gsoc/accepted-in-google-summer-of-code-2017)
- Begin creating a GitHub Gist to host milestones and deliverables checklists. (https://gist.github.com/saurabhshri/05f662806a23243bc049c4676c904233)
- Create a centralized GSoC Page. (https://saurabhshri.github.io/gsoc)
- Create accounts across CCExtractor Systems. [POST GSoC Task]
- Set up server and development environment. (Working on my laptop and gsocdev3 server)
- Fine tune the deliverables and update this Gist.
- Shamelessly insert this to tell everyone that it's my birthday on 24th. 🎂 (May The Twenty Fourth Be With You)
- Collect samples and create sample repository.
- Post the update on blog. (https://saurabhshri.github.io/2017/05/gsoc/gsoc-2017-coding-period-begins)
- Design basic workflow.
- Create basic design scheme and testing plan.
- Begin coding officially.
- Post the update on blog. (Covered in the previous blog post)
- Write a tool to extract only the important and required information from a subtitle file. (https://github.com/saurabhshri/simple-yet-powerful-srt-subtitle-parser-cpp)
- Write a diff script to compare the outputs generated by the various proposed architectures.
- Start building basic skeleton of tool. (https://github.com/saurabhshri/CCAligner)
- Post the update on blog. (https://saurabhshri.github.io/2017/06/gsoc/google-summer-of-code-week-1-the-beginning)
- Write the implementation of approximation based word tagging. (https://github.com/saurabhshri/CCAligner/blob/master/src/lib_ccaligner/generate_approx_timestamp.cpp)
- Test the implementation against sample repository. (https://www.youtube.com/watch?v=km1iHe_mGuo)
- Make a report on its accuracy.
- Document the use case scenarios.
- Post the update on blog. (https://saurabhshri.github.io/2017/06/gsoc/google-summer-of-code-week-2-valar-researchis)
The workload has been kept light for the first two weeks so that I can practice other crucial skills, such as adding test cases and understanding the mentors' code review workflow.
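The approximation based word tagging item above can be pictured with a minimal sketch. It assumes that each word receives a share of the subtitle's duration proportional to its character count; this is an illustrative assumption, not a copy of `generate_approx_timestamp.cpp`. The names `TimedWord` and `approximateWordTimes` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One word with its estimated start and end time in milliseconds.
struct TimedWord {
    std::string text;
    long startMs;
    long endMs;
};

// Split a subtitle's [startMs, endMs] interval across its words, giving
// each word a share proportional to its character count (an assumption
// made for this sketch).
std::vector<TimedWord> approximateWordTimes(const std::vector<std::string>& words,
                                            long startMs, long endMs) {
    assert(endMs >= startMs);

    long totalChars = 0;
    for (const auto& w : words) totalChars += static_cast<long>(w.size());

    std::vector<TimedWord> timed;
    if (totalChars == 0) return timed;  // nothing to distribute

    const long duration = endMs - startMs;
    long charsSoFar = 0;
    long cursor = startMs;
    for (const auto& w : words) {
        charsSoFar += static_cast<long>(w.size());
        // Working from the cumulative character count makes the last
        // word end exactly at endMs, with no rounding drift.
        const long wordEnd = startMs + duration * charsSoFar / totalChars;
        timed.push_back({w, cursor, wordEnd});
        cursor = wordEnd;
    }
    return timed;
}
```

Longer words get longer intervals, and consecutive words share their boundaries, so the estimates never overlap or leave gaps inside the subtitle.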
- Implement ability to read audio file. (https://github.com/saurabhshri/CCAligner/blob/master/src/lib_ccaligner/read_wav_file.cpp)
- Begin implementation of VAD. (https://github.com/saurabhshri/CCAligner/tree/master/demo/VAD)
- Add various output options (SRT/XML/JSON..)
- Experiment processing video samples after cutting them into small chunks based on silence zones.
- Post the update on blog. (https://saurabhshri.github.io/2017/06/gsoc/google-summer-of-code-week-3-printusage)
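As a rough illustration of the VAD and silence-zone items above, here is a minimal energy-threshold sketch. It is a deliberately simple stand-in and not necessarily the method used in the project's VAD demo; the function name `detectVoiceFrames`, the frame size and the threshold are all hypothetical choices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mark each fixed-size frame of 16-bit PCM samples as speech (true) or
// silence (false) by comparing its mean squared amplitude to a threshold.
std::vector<bool> detectVoiceFrames(const std::vector<std::int16_t>& samples,
                                    std::size_t frameSize, double energyThreshold) {
    assert(frameSize > 0);

    std::vector<bool> isSpeech;
    for (std::size_t start = 0; start + frameSize <= samples.size(); start += frameSize) {
        double energy = 0.0;
        for (std::size_t i = start; i < start + frameSize; ++i) {
            const double s = samples[i];
            energy += s * s;
        }
        isSpeech.push_back(energy / static_cast<double>(frameSize) > energyThreshold);
    }
    return isSpeech;  // a trailing partial frame is ignored
}
```

Runs of `false` frames mark candidate silence zones at which the audio could be cut into smaller chunks.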
- Read wave file from stream/pipe. (https://github.com/saurabhshri/CCAligner/commit/8e2cad80ec8a77baf83a13f8c4b511ad880c347a)
- Meet missed milestones.
- Fix bugs and improve performance.
- Prepare report for first evaluation.
- Test against sample repository.
- Post the update on blog.
- Tool for subtitle processing and basic testing architecture.
- Sample repository.
- Algorithmic and probability-based word-audio matching.
- VAD implementation.
- Post the update on blog.
- Test and experiment with different ASR.
- Start full blown ASR work.
- Begin word detection.
- Creating and fine tuning acoustic models.
- Continue completing remaining work on word detection.
- Creating and fine tuning language models and dictionaries based on subtitles.
- Creating FSGs to direct ASR to restrict recognition to specific words.
- Post the update on blog. (https://saurabhshri.github.io/2017/07/gsoc/google-summer-of-code-week-5-6-what-d-you-say)
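The idea of using an FSG to restrict recognition to specific words can be sketched without any ASR library. The example below assumes the simplest possible grammar, a straight chain of states with one transition per subtitle word; `LinearGrammar` is a hypothetical name, and a real ASR grammar would be handed to the recogniser in its own format rather than checked by hand like this.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A linear finite state grammar: state i moves to state i + 1 only on
// the i-th expected word, so the only accepted path is the subtitle
// line itself.
struct LinearGrammar {
    std::vector<std::string> expected;

    // Return true if `heard` follows every transition from the start
    // state to the final state.
    bool accepts(const std::vector<std::string>& heard) const {
        std::size_t state = 0;
        for (const auto& word : heard) {
            if (state >= expected.size() || expected[state] != word) return false;
            ++state;
        }
        return state == expected.size();
    }
};
```

Because every state has exactly one outgoing transition, a recogniser constrained this way can only choose among the words that the subtitle already contains, in the order they appear.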
- Write code for intelligently assigning timestamps on the basis of frames and probability.
- Try different settings and combinations to achieve maximum accuracy.
- Create a logic based on fuzzy search that shall look for words approximately ahead and behind the set domain.
- Create custom dictionaries on the fly.
- Fix bugs and complete missed milestones.
- Work on improving functionality, speed and accuracy.
- Look into ways of improving the recognition (like segmentation, noise reduction).
- Start preparing for Midterm Evaluations.
- Post the update on blog. (https://saurabhshri.github.io/2017/08/gsoc/google-summer-of-code-week-7-8-let-s-karaoke)
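The search item above, looking for words slightly ahead of and behind the expected position, might look like the following sketch. It is a simplification: words are compared by exact string equality and only the position is allowed to vary, whereas a fuller fuzzy search could also tolerate spelling differences. The name `findNearExpected` and the `window` parameter are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Look for `target` in `recognised`, starting at the expected index and
// widening the search one step at a time, up to `window` positions
// behind and ahead of it. Returns the matching index, or -1 if nothing
// matches inside the window.
long findNearExpected(const std::vector<std::string>& recognised,
                      const std::string& target, long expected, long window) {
    const long count = static_cast<long>(recognised.size());
    for (long offset = 0; offset <= window; ++offset) {
        const long behind = expected - offset;
        const long ahead = expected + offset;
        if (behind >= 0 && behind < count &&
            recognised[static_cast<std::size_t>(behind)] == target) {
            return behind;
        }
        if (ahead >= 0 && ahead < count &&
            recognised[static_cast<std::size_t>(ahead)] == target) {
            return ahead;
        }
    }
    return -1;
}
```

Searching outward from the expected index prefers the closest match, which helps when the same word occurs several times in a short span.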
- Word recognition and timed transcription.
- Tuned language models and dictionaries.
- Adaptation script and implementation for custom models and dictionaries based on subtitles.
- Exporting result with colored identification of recognised/non-recognised words. (Just for demonstration purpose)
- Post the update on blog. (https://saurabhshri.github.io/2017/08/gsoc/google-summer-of-code-the-mid-term-evaluations)
- Begin phoneme detection.
- Creating and fine tuning phonetic language models and dictionaries based on subtitles.
- Write code for intelligently assigning timestamps on the basis of frames and probability.
- Post the update on blog.
- Work on various output formats, such as XML, JSON and text, to store and dump data in.
- Clean up the code. Fix bugs, memory leaks.
- Logging and error handling.
- Work on missing documentation and missed milestones.
- Prepare submission reports.
- Post the update on blog.
- Tool capable of word-by-word audio-subtitle synchronisation.
- API for word-by-word audio-subtitle synchronisation.
- Complete project documentation.
- Finalisation and completion of project.
- Project submission.
- Post the update on blog.