@chapmanjacobd
Created March 24, 2024 16:22
lb eda temp.db -L inf
$ lb eda temp.db -L inf
## temp.db:playlists
### Shape
(1, 8)
### Sample of rows
| | id | time_modified | time_deleted | extractor_config | extractor_key | path | time_created | hours_update_delay |
|----|------|-----------------|----------------|--------------------|-----------------|---------------|----------------|----------------------|
| 0 | 1 | 1700731270 | 0 | {} | Local | /home/xk/temp | 1700731269 | 70 |
### Summary statistics
| | id | time_modified | time_deleted | time_created | hours_update_delay |
|-------|------|-----------------|----------------|----------------|----------------------|
| count | 1 | 1 | 1 | 1 | 1 |
| mean | 1 | 1.70073e+09 | 0 | 1.70073e+09 | 70 |
| std | nan | nan | nan | nan | nan |
| min | 1 | 1.70073e+09 | 0 | 1.70073e+09 | 70 |
| 25% | 1 | 1.70073e+09 | 0 | 1.70073e+09 | 70 |
| 50% | 1 | 1.70073e+09 | 0 | 1.70073e+09 | 70 |
| 75% | 1 | 1.70073e+09 | 0 | 1.70073e+09 | 70 |
| max | 1 | 1.70073e+09 | 0 | 1.70073e+09 | 70 |
### Pandas columns with 'converted' dtypes
| column | original_dtype | converted_dtype |
|--------------------|------------------|-------------------|
| id | int64 | Int64 |
| time_modified | int64 | Int64 |
| time_deleted | int64 | Int64 |
| extractor_config | object | string |
| extractor_key | object | string |
| path | object | string |
| time_created | int64 | Int64 |
| hours_update_delay | int64 | Int64 |
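The int64 to Int64 and object to string pairs in this table match what pandas' `DataFrame.convert_dtypes()` produces. Whether `lb eda` calls that method is an assumption; the sketch below only reproduces the same kind of table on a one-row frame with made-up construction code.

```python
import pandas as pd

# One-row frame mimicking a few playlists columns (values copied from the sample row).
df = pd.DataFrame({
    "id": pd.Series([1], dtype="int64"),
    "path": pd.Series(["/home/xk/temp"], dtype=object),
    "hours_update_delay": pd.Series([70], dtype="int64"),
})

# convert_dtypes() switches to nullable extension dtypes where it can.
converted = df.convert_dtypes()

# Side-by-side dtype names, indexed by column name.
dtype_table = pd.DataFrame({
    "original_dtype": df.dtypes.astype(str),
    "converted_dtype": converted.dtypes.astype(str),
})
print(dtype_table)
```

The nullable dtypes matter for EDA because an integer column can then hold missing values without being silently upcast to float.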
### Missing values
0 nulls/NaNs (0.0% dataset values missing)
## temp.db:media
### Shape
(321042, 9)
### Sample of rows
| | id | playlist_id | path | size | type | time_created | time_modified | time_downloaded | time_deleted |
|--------|--------|---------------|--------------------------------------------|---------|----------------------|----------------|-----------------|-------------------|----------------|
| 0 | 112078 | 1 | /home/xk/temp/recup_dir.30/f14309395.wab | 858624 | Outlook address file | 1700707621 | 1700707621 | 1700731269 | 0 |
| 1 | 95604 | 1 | /home/xk/temp/recup_dir.270/f98048776.mp3 | 1058589 | audio/mpeg | 1700708064 | 1700708064 | 1700731269 | 0 |
| 2 | 282068 | 1 | /home/xk/temp/recup_dir.600/f345112288.txt | 963 | text/plain | 1700711574 | 1700711574 | 1700731269 | 0 |
| 321039 | 235815 | 1 | /home/xk/temp/recup_dir.518/f323333760.svg | 12339 | image/svg+xml | 1700711359 | 1700711359 | 1700731269 | 0 |
| 321040 | 296866 | 1 | /home/xk/temp/recup_dir.627/f378085696.elf | 12288 | ELF executable | 1700711750 | 1700711750 | 1700731269 | 0 |
| 321041 | 246015 | 1 | /home/xk/temp/recup_dir.536/f327458904.svg | 2067 | image/svg+xml | 1700711388 | 1700711388 | 1700731269 | 0 |
### Summary statistics
| | id | playlist_id | size | time_created | time_modified | time_downloaded | time_deleted |
|-------|----------|---------------|------------------|------------------|------------------|-------------------|----------------|
| count | 321042 | 321042 | 321042 | 321042 | 321042 | 321042 | 321042 |
| mean | 160522 | 1 | 290806 | 1.70071e+09 | 1.6515e+09 | 1.70073e+09 | 0 |
| std | 92677 | 0 | 1.12785e+07 | 1633.81 | 1.42946e+08 | 0 | 0 |
| min | 1 | 1 | 10 | 1.70071e+09 | -1.16445e+10 | 1.70073e+09 | 0 |
| 25% | 80261.2 | 1 | 667 | 1.70071e+09 | 1.70071e+09 | 1.70073e+09 | 0 |
| 50% | 160522 | 1 | 3072 | 1.70071e+09 | 1.70071e+09 | 1.70073e+09 | 0 |
| 75% | 240782 | 1 | 17920 | 1.70071e+09 | 1.70071e+09 | 1.70073e+09 | 0 |
| max | 321042 | 1 | 5.13411e+09 | 1.70072e+09 | 4.29372e+09 | 1.70073e+09 | 0 |
### Pandas columns with 'converted' dtypes
| column | original_dtype | converted_dtype |
|-----------------|------------------|-------------------|
| id | int64 | Int64 |
| playlist_id | int64 | Int64 |
| path | object | string |
| size | int64 | Int64 |
| type | object | string |
| time_created | int64 | Int64 |
| time_modified | int64 | Int64 |
| time_downloaded | int64 | Int64 |
| time_deleted | int64 | Int64 |
### Numerical columns
#### Bins
| id | count |
|--------------------------|---------|
| (-320.041, 53507.833] | 53507 |
| (53507.833, 107014.667] | 53507 |
| (107014.667, 160521.5] | 53507 |
| (160521.5, 214028.333] | 53507 |
| (214028.333, 267535.167] | 53507 |
| (267535.167, 321042.0] | 53507 |
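The bin edges are consistent with six equal-width bins whose lowest edge is pushed down by 0.1% of the value range, which is how `pandas.cut` behaves when given an integer bin count. That `lb eda` uses `pandas.cut` is an assumption; `equal_width_edges` is a hypothetical helper that reproduces the edges in plain Python.

```python
def equal_width_edges(lo, hi, bins):
    """Equal-width bin edges, with the first edge lowered by 0.1% of the
    range so that the minimum falls inside the first (left-open) interval."""
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    edges[0] = lo - (hi - lo) * 0.001
    edges[-1] = hi
    return [round(edge, 3) for edge in edges]

# min/max of the id column, taken from the summary statistics
id_edges = equal_width_edges(1, 321042, 6)
print(id_edges)

# min/max of the size column (10 bytes to 5134105561 bytes)
size_edges = equal_width_edges(10, 5134105561, 6)
print(size_edges)
```

Equal-width bins explain why the id counts are uniform (6 × 53507 = 321042) while almost every file lands in the first size bin: a handful of multi-gigabyte files stretch the range.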
| size | count |
|------------------------------|---------|
| (-5134095.551, 855684268.5] | 321037 |
| (855684268.5, 1711368527.0] | 2 |
| (1711368527.0, 2567052785.5] | 2 |
| (2567052785.5, 3422737044.0] | 0 |
| (3422737044.0, 4278421302.5] | 0 |
| (4278421302.5, 5134105561.0] | 1 |
| time_created | count |
|----------------------------------|---------|
| (1700707548.306, 1700709676.667] | 182918 |
| (1700709676.667, 1700711792.333] | 137629 |
| (1700711792.333, 1700713908.0] | 494 |
| (1700713908.0, 1700716023.667] | 0 |
| (1700716023.667, 1700718139.333] | 0 |
| (1700718139.333, 1700720255.0] | 1 |
| time_modified | count |
|-------------------------------------|---------|
| (-11660411790.827, -8988108462.167] | 8 |
| (-8988108462.167, -6331743324.333] | 0 |
| (-6331743324.333, -3675378186.5] | 0 |
| (-3675378186.5, -1019013048.667] | 0 |
| (-1019013048.667, 1637352089.167] | 48945 |
| (1637352089.167, 4293717227.0] | 272089 |
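The time columns look like Unix timestamps in seconds (an assumption; the output does not state the unit). Converting the extremes shows why the `time_modified` range is so wide. If the lowest edge was lowered by 0.1% of the range, as the other bin tables suggest, the minimum works out to -11644473600, which is 1601-01-01 UTC, the zero point of Windows FILETIME. The 8 rows in the first bin therefore probably carry an unset modification time rather than a real one. `from_unix` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def from_unix(seconds):
    """Convert Unix seconds to an aware UTC datetime.
    Uses timedelta arithmetic so negative values work on every platform."""
    return EPOCH + timedelta(seconds=seconds)

# Upper edge of the last time_modified bin (the column maximum).
latest = from_unix(4293717227)

# Minimum recovered from the first bin edge, assuming the 0.1% extension:
# (-11660411790.827 + 0.001 * 4293717227) / 1.001
min_seconds = round((-11660411790.827 + 0.001 * 4293717227) / 1.001)
earliest = from_unix(min_seconds)

# time_created of the first sample row, for comparison.
typical = from_unix(1700707621)

print(earliest, typical, latest, sep="\n")
```

Dates in 1601 and 2106 are not plausible for recovered files, so filtering `time_modified` to a sane window before further analysis seems reasonable.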
### Categorical columns
#### common values of type column
| type | Count | Percentage |
|---------------------------|---------|--------------|
| text/plain | 121950 | 37.9857 |
| Audition graphic filter | 31778 | 9.89839 |
| text/xml | 29625 | 9.22776 |
| image/png | 29112 | 9.06797 |
| ELF executable | 15611 | 4.8626 |
| image/svg+xml | 14959 | 4.65951 |
| image/jpeg | 8799 | 2.74076 |
| application/octet-stream | 7638 | 2.37913 |
| application/x-python-code | 6121 | 1.9066 |
| text/html | 5850 | 1.82219 |
| GZIP Archive file | 5508 | 1.71566 |
| audio/mpeg | 5395 | 1.68047 |
| XML Document | 4819 | 1.50105 |
| text/x-python | 4646 | 1.44716 |
| INI Config file | 4425 | 1.37832 |
| audio/x-wav | 2482 | 0.773108 |
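The Percentage column appears to divide by the total row count, nulls included: 121950 / 321042 gives 37.9857%, whereas dividing by the 315468 non-null rows would give about 38.66%. A minimal sketch of that calculation, with `common_values` as a hypothetical helper and a made-up four-row sample:

```python
from collections import Counter

def common_values(values):
    """Return (value, count, percentage) tuples, most common first.
    Percentages are relative to all rows, including missing ones."""
    total = len(values)
    counts = Counter(v for v in values if v is not None)
    return [(v, c, c / total * 100) for v, c in counts.most_common()]

sample = ["text/plain", "text/plain", None, "image/png"]
rows = common_values(sample)
print(rows)

# Check against the first row of the table above.
top_pct = 121950 / 321042 * 100
print(f"{top_pct:.4f}")
```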
#### High cardinality (many unique values)
- path
#### Low cardinality (many similar values)
- type
### Missing values
5,574 nulls/NaNs (0.2% dataset values missing)
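The 0.2% figure is relative to every cell in the table (rows × columns), not to rows. All 5,574 nulls sit in the `type` column, so the per-column rate is about 1.7%. The arithmetic, with `missing_share` as a hypothetical helper:

```python
def missing_share(null_count, rows, columns=1):
    """Percentage of cells that are null in a rows x columns table."""
    return null_count / (rows * columns) * 100

dataset_share = missing_share(5574, 321042, 9)  # all 9 media columns
column_share = missing_share(5574, 321042)      # the type column alone

print(f"{dataset_share:.1f}% of dataset values missing")
print(f"{column_share:.1f}% of type values missing")
```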
#### 8 columns with no missing values
- id
- playlist_id
- path
- size
- time_created
- time_modified
- time_downloaded
- time_deleted
#### Value stats
| column | values | null | zero | empty_string |
|-----------------|-----------------|-------------|-----------------|----------------|
| id | 321042 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| playlist_id | 321042 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| path | 321042 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| size | 321042 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| time_created | 321042 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| time_modified | 321042 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| time_downloaded | 321042 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| type | 315468 (98.3%) | 5574 (1.7%) | 0 (0.0%) | 0 (0.0%) |
| time_deleted | 0 (0.0%) | 0 (0.0%) | 321042 (100.0%) | 0 (0.0%) |
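In this table, "values" seems to mean rows that are neither null, zero, nor an empty string, which is why `time_deleted` shows 0 values and 321042 zeros. Under that assumption the same breakdown can be computed directly in SQLite. The table contents below are made up and `value_stats` is a hypothetical helper, not part of `lb`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (path TEXT, type TEXT, time_deleted INTEGER)")
conn.executemany(
    "INSERT INTO media VALUES (?, ?, ?)",
    [
        ("/tmp/a.txt", "text/plain", 0),
        ("/tmp/b.bin", None, 0),
        ("/tmp/c.svg", "image/svg+xml", 0),
    ],
)

def value_stats(conn, table, column):
    """Count null, zero, and empty-string cells in one column.
    Identifiers are interpolated into the SQL, so pass trusted names only."""
    total, null, zero, empty = conn.execute(
        f"""
        SELECT COUNT(*),
               COALESCE(SUM("{column}" IS NULL), 0),
               COALESCE(SUM("{column}" = 0), 0),
               COALESCE(SUM("{column}" = ''), 0)
        FROM "{table}"
        """
    ).fetchone()
    return {
        "values": total - null - zero - empty,
        "null": null,
        "zero": zero,
        "empty_string": empty,
    }

type_stats = value_stats(conn, "media", "type")
deleted_stats = value_stats(conn, "media", "time_deleted")
print(type_stats)
print(deleted_stats)
```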