Lumen is an independent third-party research project studying cease-and-desist letters concerning online content. Lumen collects and analyzes requests to remove material from the web. Its main goals are to educate the public and to facilitate research about the different kinds of complaints and removal requests, both legitimate and questionable, that are sent to Internet publishers and service providers. Lumendatabase contains millions of notices.
A statistics dashboard is a must-have feature for Lumendatabase. It gives a quick overview of the number of links that were added, who added them, the links added in a particular time frame, and similar figures, which makes it particularly useful for summarizing statistics. The data collected would also be useful for future plans to introduce some type of recommendation system in Lumendatabase. The second part of the project was the creation of web archives, using either perma.cc or a custom archival mechanism.
- Total number of notices
- Total number of URLs
- Notices by sender, receiver, and submitter
- Notices involving a particular domain
- Visitors by country
- Number of URLs per entity
- Word cloud from notice texts
- Total number of unique entities
- Web archival with perma / custom implementation
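
As a rough illustration of the kind of aggregation behind a few of these statistics, here is a minimal plain-Ruby sketch. It is not the project's implementation: the `NOTICES` sample data, the hash layout, and the helper names `total_urls`, `notices_by_sender`, and `notices_for_domain` are all hypothetical, and the real dashboard works against Lumendatabase's own models and data stores rather than in-memory hashes.

```ruby
require "uri"

# Hypothetical sample data; Lumendatabase's real schema is not shown here.
NOTICES = [
  { sender: "Acme Media", urls: ["https://example.com/a", "https://example.com/b"] },
  { sender: "Acme Media", urls: ["https://example.org/x"] },
  { sender: "Jane Doe",   urls: ["https://example.com/c"] }
].freeze

# Total number of URLs across all notices.
def total_urls(notices)
  notices.sum { |notice| notice[:urls].size }
end

# Number of notices per sender, as a hash of sender name => count.
def notices_by_sender(notices)
  notices.each_with_object(Hash.new(0)) do |notice, counts|
    counts[notice[:sender]] += 1
  end
end

# Host part of a URL, or nil if the string is not a valid URI.
def host_of(url)
  URI.parse(url).host
rescue URI::InvalidURIError
  nil
end

# Number of notices that mention at least one URL on the given domain.
def notices_for_domain(notices, domain)
  notices.count do |notice|
    notice[:urls].any? { |url| host_of(url) == domain }
  end
end
```

For example, `notices_for_domain(NOTICES, "example.com")` counts each notice once, even when it lists several URLs on that domain. At the scale of millions of notices, counts like these would normally be pushed down to the database or search index instead of being computed in Ruby.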
- Discussing and finalizing product requirements
- Designing the mock UI and overall architecture
- Link to the mock UI: https://drive.google.com/file/d/1F9gg5jcPE_GlqY9kDtNtBnZ_aN8zRp2i/view
- Designing for the scale and volume of data that Lumendatabase handles
- Mock UI (https://drive.google.com/file/d/1F9gg5jcPE_GlqY9kDtNtBnZ_aN8zRp2i/view)
- berkmancenter/lumendatabase#546
- berkmancenter/lumendatabase#547
- berkmancenter/lumendatabase#553
- berkmancenter/lumendatabase#554
- berkmancenter/lumendatabase#558
- berkmancenter/lumendatabase#560 (merged through berkmancenter/lumendatabase#566)
- berkmancenter/lumendatabase#568
- berkmancenter/lumendatabase#567
- berkmancenter/lumendatabase#562
- berkmancenter/lumendatabase#559
- berkmancenter/lumendatabase#572
- Addressing review comments for open PRs till they get merged
- Web archival with perma / custom implementation
- Defining user access restrictions of the dashboard
- Designing a feature or product end to end, from gathering requirements to testing
- Unique and varied features of the Postgres database
- Handling large amounts of data and processing it efficiently
- Rails conventions, code quality, and working with elasticsearch-rails
- Importance of unit and integration tests