@bwhitman
Created January 2, 2015 01:33
Attachment instructions for the MSD
The toughest part was getting access to an EC2 instance. I followed
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html#access-ec2
To set up the aws command line interface, I followed
http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html
-> http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-set-up.html
-> http://docs.aws.amazon.com/cli/latest/userguide/installing.html#install-bundle-other-os
(for my MacBook)
-> http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AccessingInstancesLinux.html
I don't recall how I set up the public/private key pair, but it wasn't
that hard.
Once I had a default, minimum-cost Ubuntu EC2 instance running in the
us-east-1 region ("N. Virginia"), I was able to use the AWS web EC2
Dashboard to create an EBS volume from the Million Song Dataset
snapshot, snap-5178cf30, following the directions on:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-public-data-sets.html#using-public-data-sets-launching-mounting
then attach it to my Ubuntu instance following:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-attaching-volume.html
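The same two Dashboard steps (create a volume from the snapshot, then
attach it) can probably also be done from the aws command line. This
is only a sketch, not what I actually ran: the availability zone and
instance ID below are made-up placeholders, and /dev/sdf is an
assumed device name that shows up as /dev/xvdf on this kind of
instance.

```shell
#!/bin/sh
# Sketch: create an EBS volume from the MSD snapshot and attach it.
# AZ and INSTANCE_ID are hypothetical placeholders; substitute your own.
set -e

REGION="us-east-1"                # the snapshot is only visible in this region
SNAPSHOT_ID="snap-5178cf30"       # Million Song Dataset snapshot
AZ="${AZ:-us-east-1a}"            # must be the instance's availability zone
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"

# Create the volume and print only its ID.
create_msd_volume() {
  aws ec2 create-volume \
    --region "$REGION" \
    --availability-zone "$AZ" \
    --snapshot-id "$SNAPSHOT_ID" \
    --query VolumeId --output text
}

# Wait until the volume ($1) is available, then attach it.
attach_msd_volume() {
  aws ec2 wait volume-available --region "$REGION" --volume-ids "$1"
  aws ec2 attach-volume \
    --region "$REGION" \
    --volume-id "$1" \
    --instance-id "$INSTANCE_ID" \
    --device /dev/sdf
}

# Usage:
#   vol=$(create_msd_volume)
#   attach_msd_volume "$vol"
```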
Then I ssh'd into the instance:
> ssh -i ~/Documents/aws/amazonec2.pem ubuntu@54.173.207.160
(where amazonec2.pem is the private key file I created when setting up EC2)
Because I had already attached the volume, the device was already there:
ubuntu@ip-172-30-0-62:~$ sudo file -s /dev/xvdf
/dev/xvdf: Linux rev 1.0 ext3 filesystem data,
UUID=21a8ff2f-0b14-46a8-8e69-62951b27dfd4 (large files)
So I just had to mount it:
ubuntu@ip-172-30-0-62:~$ sudo mkdir /mnt/snap
ubuntu@ip-172-30-0-62:~$ sudo mount -t ext4 /dev/xvdf /mnt/snap
ubuntu@ip-172-30-0-62:~$ ls /mnt/snap
AdditionalFiles data LICENSE lost+found README
ubuntu@ip-172-30-0-62:~$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 7.8G 808M 6.6G 11% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
udev 492M 12K 492M 1% /dev
tmpfs 100M 328K 99M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 497M 0 497M 0% /run/shm
none 100M 0 100M 0% /run/user
/dev/xvdf 493G 272G 196G 59% /mnt/snap
The 493G partition at the end (of which only 272G is used) is the MSD
data. You could scp it off that Linux instance, but it probably makes
sense to run your processing on the EC2 instance itself.
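If you do want a local copy, pulling one subdirectory at a time is
likely more practical than copying all 272G in one go. A rough
sketch, assuming rsync is installed on both ends and reusing the key
file and address from the ssh command above (yours will differ);
fetch_msd_dir is just a name made up for this example. Keep in mind
that AWS bills for data transferred out of EC2.

```shell
#!/bin/sh
# Sketch: copy one directory of the mounted MSD volume to a local disk.
# KEY and HOST default to the values used in the ssh command above.
set -e

KEY="${KEY:-$HOME/Documents/aws/amazonec2.pem}"
HOST="${HOST:-ubuntu@54.173.207.160}"

# $1: directory relative to /mnt/snap (e.g. AdditionalFiles)
# $2: local destination directory
fetch_msd_dir() {
  mkdir -p "$2"
  # -a preserves the tree; --partial keeps partly transferred files
  # so an interrupted copy can be resumed by re-running the command.
  rsync -av --partial -e "ssh -i $KEY" "$HOST:/mnt/snap/$1/" "$2/"
}

# Usage:
#   fetch_msd_dir AdditionalFiles ./msd/AdditionalFiles
```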
DAn.
@andyyuan78

do you know how to download it to your local PC disk?

@KimRasak

KimRasak commented Mar 9, 2021

Hey dude, thank you so much.
Besides, be sure to switch your region to us-east-1; otherwise you won't find any match for the snapshot ID.
