$ cat emma.txt | tr A-Z a-z | tr -sc "a-z0-9" "\n" | sort | uniq -c | sort -nr | head -5
5242 to
5209 the
4898 and
4300 of
3192 i
$ cat emma.txt | tr A-Z a-z | tr -sc "a-z0-9" "\n" | sort | uniq | cut -c1 | uniq -c | sort -nr | head -5
763 s
672 c
567 p
554 a
523 d
# Note: the first uniq collapses repeated words, so each distinct word is counted only ONCE.
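The effect of the first `uniq` can be checked on a toy input. The sketch below assumes a two-line sample in place of emma.txt, and the function name `first_letter_counts` is made up for this illustration.

```shell
# Same pipeline as above, wrapped in a function that reads stdin.
# first_letter_counts is a hypothetical name used only for this example.
first_letter_counts() {
  tr 'A-Z' 'a-z' |          # lowercase everything
    tr -sc 'a-z0-9' '\n' |  # one word per line
    sort | uniq |           # keep each distinct word once
    cut -c1 |               # first letter of each distinct word
    uniq -c |               # count letters (input is still sorted)
    sort -nr
}

# Toy sample (an assumption, not taken from emma.txt).
printf 'The cat saw the other cat.\nSo the cat sat.\n' | first_letter_counts
```

Here "cat" and "the" each occur three times but contribute only one `c` and one `t`, while `s` comes out on top with a count of 3 (sat, saw, so).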
$ cut -f3 movies | sort | uniq -c | sort -nr | head -5
441 2002
405 2000
403 2001
384 1998
384 1996
$ cat ratings | cut -f4 | sort | uniq -c | sort -nr | head
432 01-03-1996 00:00:00
63 26-07-2005 19:24:47
44 28-03-1996 22:58:30
42 30-03-1996 16:27:16
42 27-03-1996 19:23:03
42 16-04-1996 13:08:41
42 15-04-1996 10:23:54
42 14-04-1996 17:37:12
42 14-04-1996 14:45:40
$ cut -f3 ratings | sort | uniq -c
94988 0.5
384180 1
118278 1.5
790306 2
370178 2.5
2356676 3
879764 3.5
2875850 4
585022 4.5
$ sort harfler | uniq -c | sort -nr | head -5 | sort -k2
15 e
11 f
15 n
9 u
10 y
$ rev movies | cut -f1-2 | rev | sort | uniq -c | sort -nr | head
85 2002 Drama
84 2000 Drama
80 1998 Drama
80 1996 Drama
79 1999 Drama
78 2001 Drama
74 1995 Drama
68 1997 Drama
62 1994 Drama
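The `rev | cut -f1-2 | rev` idiom picks the last two tab-separated fields without knowing how many columns a row has. A minimal sketch of the same idea next to an `awk` alternative, assuming a hypothetical four-column sample (id, title, year, genre) rather than the real movies file; both function names are invented for this example.

```shell
# Hypothetical tab-separated rows: id, title, year, genre (not the real data).
sample=$(printf '1\tAlpha\t2002\tDrama\n2\tBeta\t2002\tDrama\n3\tGamma\t2000\tComedy\n')

# Reverse each line, take the first two fields, reverse back.
last_two_rev() { rev | cut -f1-2 | rev; }

# Alternative: address the last two fields directly via NF.
last_two_awk() { awk -F'\t' '{ print $(NF-1) "\t" $NF }'; }

printf '%s\n' "$sample" | last_two_rev | sort | uniq -c | sort -nr
```

Both helpers should give the same year/genre pairs on this sample. The `awk` form avoids the double reversal, while the `rev` form needs nothing beyond `rev` and `cut`.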
$ cut -f2 Ecoli-cds-protein | cut -c1-3 | sort | uniq -c | sort -nr
3715 ATG
307 GTG
71 TTG
2 CTG
2 ATT
Verifying that "alperyilmaz.id" is my Blockstack ID. https://onename.com/alperyilmaz
$ docker run --rm -v "$(pwd)/data2":/data comics/bowtie2 wgsim -e 0 -N 10000 -1 300 -2 300 -r 0 -R 0 /data/s_cerevisiae.fa /data/sample_seq_1.fastq /data/sample_seq_2.fastq
[wgsim] seed = 1524818001
[wgsim_core] calculating the total length of the reference sequence...
[wgsim_core] 18 sequences, total length: 12162995
$ ls -alh data2
total 25M
drwxr-xr-x 2 alper alper 4.0K Apr 27 11:32 .
drwxr-xr-x 4 alper alper 4.0K Apr 27 11:31 ..
-rw-r--r-- 1 root root 6.2M Apr 27 11:33 sample_seq_1.fastq