@NotCompsky
Last active July 28, 2020 01:40
tagem_blog

Hello, welcome to my TEDx talk, where I will read from my Medium blog.

The very beginning of this project starts with my attempts, sometime before summer 2018, at lucid dreaming. I had been having the best success when I was listening to music - hearing music while wearing no earphones was a major hint that I was dreaming. I created a project which attempted to play my normal fall-asleep music, wait a couple of hours, and then play a sequence of unusual sounds, flash LED lights and activate a fan. I cannot find the first versions - which activated an external USB device that controlled the power to USB lights and a USB fan - only the final version of that project, which went down YouTube rabbit holes and used text-to-speech instead.

The problem was that I needed a way to control what was being played to me as I was sleeping. To go to sleep, I needed a solid playlist of relatively calm music. Simple enough. But for the later music - which needed to rouse my conscious mind from sleep, and which I reasoned would become less effective the more frequently I played it - I needed a vast quantity of music that was verified not to startle me awake, while still being stimulating enough to rouse my mind. This is where tagging came in.

That eventually gave rise to a project specifically about tagging files, not just for lucid dreaming, around summer 2018. It was another Python/BASH project, eventually called mytag, for the rapid tagging, deleting and moving of text files as a kind of ersatz note-taking system. That was back before I learnt SQL, when I was in fact terrified of it, so all metadata was saved into JSON files.

This tagger later expanded to work with other file types, and had a shell script to call other programs to display the contents of the file for certain types of files (PDF, HTML, spreadsheet, CSV, image, audio, video), to speed up tagging.

I then discovered that MPV Player - at the time my default video player - had a JSON API. I could then automate my script better, not needing to open and close new windows to view the next content - at least for images and video.
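For anyone who hasn't used it, here is a rough sketch of what driving mpv over that interface can look like. This is not the original script, just an illustration. It assumes mpv was started with `--input-ipc-server=/tmp/mpvsocket` (the socket path is an arbitrary choice) on a system with Unix sockets; the function names are made up for this example.

```python
import json
import socket


def encode_mpv_command(*args):
    """Encode one command as a newline-terminated JSON message for mpv's IPC."""
    return (json.dumps({"command": list(args)}) + "\n").encode("utf-8")


def send_mpv_command(socket_path, *args):
    """Send a command to a running mpv and return its reply as a dict."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(encode_mpv_command(*args))
        with sock.makefile("r", encoding="utf-8") as reader:
            for line in reader:
                message = json.loads(line)
                # Replies carry an "error" field; asynchronous events carry
                # an "event" field instead, so skip those.
                if "error" in message:
                    return message
    return None
```

With mpv running, something like `send_mpv_command("/tmp/mpvsocket", "loadfile", "next.webm")` would swap the file in the existing window rather than opening a new one.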

It merged with another Python project: a Tkinter GUI utility - also from summer 2018 - for aiding the generation of computer vision datasets. It allowed the drawing and tagging of rectangles on images, and used those rectangles to generate cropped subimages. Data was stored in both JSON and pickle formats.

The two databases were upgraded into a single hierarchical tagging database.

In early 2019, I learnt some C++, in order to make some (harmless) malware to shuffle a friend's home directories every time his computer booted up. Before that, I had only experienced C++ in the unavoidable Linux way of needing to occasionally edit a source file in order to get it to compile.

I took a liking to C++, and realised that - of course! - there must be some great C++ GUI libraries out there. (There actually aren't lol). In early 2019, this project was officially born, as I had decided to move the tagger to Qt, for more control of the UI (specifically key and mouse events). I think I entirely overlooked the fact that Python bindings existed for Qt. But that's a common theme in how I learn things - once I learn about a cool thing, I try to apply it to more places than it truthfully belongs.

I think the move to SQL occurred at the same time. However, it originally used CPPConn, MySQL's (official?) C++ connector library. I was developing rscraper alongside this project - rscraper being another example of applying a 'cool new thing' (in this case, C++) somewhere it doesn't really belong (web scraping). rscraper is the project in which I gradually made a wrapper for cppconn, then for MySQL's C API, which eventually spawned the (beautifully metaprogrammed) MySQL portion of libcompsky.

One of the first additions to this project was a Caffe2 pipeline, to automate even further what I had been doing in Python - as I had noticed that the Python library I was using was actually a wrapper around a C++ library, and using C++ myself I could do what I needed to do much more simply.

I paused my machine learning learning soon after, as I was burnt out from maths and wanted to avoid heavy theoretical maths. I found that the tagger made for a great music player - and eventually I realised that I could transfer those playlists to my phone. I wrote a pipeline - a playlist file generator and a BASH script to transfer playlists and files to Android - to generate and transfer playlist files recognised by VLC, with playlists specified by intersections of tags and scores.
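To give a flavour of the idea, here is a minimal sketch of a playlist generator, not the actual pipeline. It assumes a hypothetical in-memory library mapping each file path to its tags and a score, reads "intersection of tags" as "the file has every requested tag", and writes a plain M3U playlist, which VLC can open.

```python
def build_m3u(library, required_tags, min_score):
    """Return M3U text listing files that have all required tags and a high enough score.

    `library` is a hypothetical dict: path -> {"tags": set of str, "score": number}.
    """
    wanted = set(required_tags)
    lines = ["#EXTM3U"]
    for path, info in sorted(library.items()):
        has_all_tags = wanted <= set(info["tags"])
        if has_all_tags and info["score"] >= min_score:
            lines.append(path)
    return "\n".join(lines) + "\n"
```

The returned text can be written to a `.m3u` file and copied to the phone alongside the files it lists.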

However, VLC had several annoying issues - issues that were very annoying to me, but were probably ignored by most users. I have lots of music videos which I usually listen to as audio-only, and lots of WEBM audio files which VLC initially treats as video (until it finds no video stream). VLC stops playing a playlist in the background if a video is encountered (including a WEBM it doesn't recognise as audio-only). VLC makes it annoying to specify 'play as audio' - that option can only be selected while playing a video, so every time I wanted to play a mixed playlist of audio and video files all as audio, I had to cycle to a video file and select the option. If the playlist contained only WEBMs, each WEBM would be treated as a video for a split second - not long enough to select the option - so I had to cycle through every WEBM file in order for VLC to recognise them all as audio-only. These issues were the reason for giving tagem a server.

The choice of C++ for the server was primarily because it is mostly reading/writing SQL and writing JSON, and I already had a C++ library that is - imo - easier to use in C++ than SQL libraries in Python or Go (the only other languages I have built servers in), and which I can add anything to if it makes this project tidier. Perhaps some of these features are in some packages in some other languages, but the time it would take to find these features far exceeds the time it takes to add them to my library. To its credit, Go was an amazingly easy language to write a server in, but since I had already got a working C++ server framework, that was not a big selling point.

The server was based on the server I had made for rscraper, and absorbed a previous project I had been working on, a Python (Flask) file sharing server. Although I had planned to build it on top of Proxygen, I went lower-level for efficiency, because I had previously developed a code generator - for a URL parsing project - of which libtriegen.py is part, that made it trivial to do so. In hindsight this massively increased the knowledge I had to have of the Facebook web stack due to bugs that appeared in fringe cases, so it might have been a mistake. But it's (almost) all good now.

I kept on adding features because HTML5 and modern Javascript make it surprisingly easy to.

Some further features were transplants from yet more projects. For instance, the qry language is evolved from a shell script I used for querying large scraped databases. It was easier to write in C++ than in shell script - the biggest headache is now not trusting user input.

Since qry was ported over, I integrated it with several other projects that had previously had their own tagging systems, such as two extremely simple scrapers for Twitter and Reddit.

Going forwards, I expect rscraper will also be integrated into this project's tagging system.

rudluff commented Jul 28, 2020

You can get Android phones to behave normally with WEBM audio-only files by renaming them to end in .ogg. Nothing fancy, a simple mv works. WEBM audio liberated from YouTube is always going to be Opus, and Opus is allowed to end in either .opus or .ogg; most Android phones don't like the former extension but are perfectly happy to play Opus audio in an ogg container, lol.

(Not to imply that the project isn't Very Cool, it's just that this one specific thing stuck out at me.)
