@anteaya
Created February 22, 2013 23:56
swift and ceph
<anteaya> just one question
<anteaya> That method will call GETorHEAD which in
<anteaya> do you think it would look better if it was GET or HEAD
<anteaya> rather than GETorHEAD as it is now? GETorHEAD looks like one message rather than two
* lloydde (~lloydde@hq-r.pistoncloud.com) has joined #openstack-101
* torgomatic (~torgomati@meat.andcheese.org) has joined #openstack-101
* dolphm (~dolphm@cpe-67-10-140-29.satx.res.rr.com) has joined #openstack-101
<notmyname> anteaya: ok, pushed those fixes. also, GETorHEAD is correct (https://github.com/openstack/swift/blob/master/swift/proxy/controllers/obj.py#L328)
<notmyname> GETs and HEADs, per the HTTP spec, are the same except for the existence of a response body, so they are implemented in swift as the same function (rather than duplicating the code)
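As a rough illustration of the point above, here is a minimal sketch of one handler serving both GET and HEAD. This is not Swift's actual GETorHEAD code; the `Response` class, the `OBJECTS` dictionary, and the `get_or_head` function are all hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    headers: dict
    body: bytes


# Hypothetical in-memory object store, for illustration only.
OBJECTS = {"/v1/acct/cont/apple.jpg": b"fake image bytes"}


def get_or_head(method: str, path: str) -> Response:
    """Serve GET and HEAD through one code path; HEAD only drops the body."""
    if method not in ("GET", "HEAD"):
        return Response(405, {"Allow": "GET, HEAD"}, b"")
    data = OBJECTS.get(path)
    if data is None:
        return Response(404, {"Content-Length": "0"}, b"")
    # Headers are computed identically for both methods.
    headers = {"Content-Length": str(len(data))}
    body = data if method == "GET" else b""
    return Response(200, headers, body)
```

Both calls return the same status and headers; only the body differs, which is why sharing one function avoids duplicated code.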
<anteaya> nice
<anteaya> thank you, I learned something
<anteaya> notmyname, actually have you a further moment, I have a question
<anteaya> I am an intern with OpenStack and will be talking to a company next week
<anteaya> and they use both swift and ceph
<anteaya> I think if I knew the difference I would feel more confident, I don't know the difference now
<notmyname> anteaya: the biggest difference is that ceph was originally designed to be used for a distributed filesystem (although that's the part they don't have ready yet) and swift was designed for object storage. ceph implements an object storage layer on which it builds the rest. the big difference is that ceph chooses consistency over availability and swift chooses availability over consistency
<notmyname> in this case, "consistency" means that the whole system knows about the current state of the data
<anteaya> ah ha
<notmyname> in the case of hardware failures, all distributed systems must choose either consistency or availability
<notmyname> the different choices ceph and swift have made lend themselves to different use cases
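The trade-off described above can be sketched with a toy replicated value. This is only a simplified model under stated assumptions, not how either ceph or swift is implemented: the `Replica` class, the `write` and `reconcile` functions, the majority rule, and the "newest timestamp wins" rule are all hypothetical choices made for illustration.

```python
class Replica:
    """One copy of a single value, with a timestamp for the last write."""

    def __init__(self):
        self.value = None
        self.timestamp = 0
        self.reachable = True


def write(replicas, value, timestamp, mode):
    """Try to write; return True if the write was accepted.

    mode="consistent": refuse unless a majority of replicas is reachable.
    mode="available":  accept as long as at least one replica is reachable.
    """
    up = [r for r in replicas if r.reachable]
    if not up:
        return False
    if mode == "consistent" and len(up) <= len(replicas) // 2:
        return False  # give up availability to keep one agreed state
    for r in up:
        r.value, r.timestamp = value, timestamp
    return True


def reconcile(replicas):
    """After the failure heals, copy the newest write to every replica."""
    newest = max(replicas, key=lambda r: r.timestamp)
    for r in replicas:
        r.value, r.timestamp = newest.value, newest.timestamp
```

With two of three replicas unreachable, the "consistent" mode rejects the write, while the "available" mode accepts it and the copies disagree until `reconcile` runs.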
<anteaya> thank you, exactly what I was looking for: http://www.mirantis.com/blog/object-storage-openstack-cloud-swift-ceph/ doesn't make that point
<anteaya> good to know, what would be examples of the use cases for each?
<notmyname> swift is good for unstructured data with a high degree of concurrent access (user data, web content, mobile backends, storage providers, etc)
<anteaya> right
<notmyname> ceph is good for use cases where strong consistency is important (like block devices for VMs on which you create filesystems)
* dolphm has quit (Remote host closed the connection)
<anteaya> okay
<notmyname> and this is pretty much where my explanation of the differences stops so that I don't get too unbiased ;-)
<anteaya> so with swift, would you use a db on top of swift?
<notmyname> no, never
<anteaya> and I understand, thank you for your unbiasedness
<anteaya> so use swift instead of a db?
<notmyname> no, not "instead of". well, yes, but they are completely different use cases
<anteaya> I'm sorry I've been spending my time on nova and glance
<notmyname> most people have ;-)
* lloydde has quit (Remote host closed the connection)
<anteaya> my knowledge of swift is well, weak is an understatement
<notmyname> for example, use a DB for storing how your customers relate to your product matrix and billing info. use swift for storing your product images, your customer images, and copies of your invoices
<notmyname> and backups of your DBs :-)
* dolphm (~dolphm@cpe-67-10-140-29.satx.res.rr.com) has joined #openstack-101
<anteaya> ah okay
<anteaya> so the dbs would just run as part of nova then
<notmyname> I think there are several ways you could do that. so, "yes, but not necessarily"
<anteaya> and when you had something large, like images and backups, those are stored on swift
<anteaya> okay so that would be one option
<notmyname> yes, where large is either lots of bytes or lots of pieces of data (or both!)
<anteaya> okay this is good, thank you
* dolphm has quit (Remote host closed the connection)
<notmyname> you could use swift to store backups. but you can also use it to store 10 billion pictures
<anteaya> so what business would need large sizes or pieces of data stored and accessible and choose availability over consistency?
<anteaya> okay so 10 billion pictures
<anteaya> the film industry
<anteaya> or the graphics industry for film
<notmyname> if you go to wikipedia right now, any image you look at comes from their swift cluster
<anteaya> okay, good example
<notmyname> (that's the "lots of small images" use case)
<anteaya> yes, wikimedia does run on openstack
<anteaya> they are also one of the companies with interns right now
<notmyname> another example may be mobile apps. you have millions of devices using the storage cluster at the same time. scaling that is something that swift is designed to handle
<anteaya> so yes, high availability is required in that case
<anteaya> so swift scales easily?
<anteaya> compared to other solutions?
<notmyname> ya, and consistency doesn't matter as much (it doesn't matter that a picture of an apple gets updated before a picture of the mars rover even if the requests come in in a different order)
<anteaya> right
<anteaya> that makes sense
<anteaya> a good example
<notmyname> swift has no single point of failure and so scaling it is a matter of adding more servers (horizontal scaling). swift has a modular design that allows you to scale the different pieces independently
<anteaya> helpful
<anteaya> so if a swift server goes down, do the other servers have access to the same information?
<notmyname> yes. actually storing data on a drive is pretty easy. swift does 2 things: 1) places your data to prevent a single hardware fault from losing your data or making it unavailable and 2) handles hardware failures
<anteaya> interesting
<anteaya> I am really glad to get this overview
* annegentle (~annegentl@99-23-193-214.lightspeed.austtx.sbcglobal.net) has joined #openstack-101
<notmyname> swift stores replicas of your data. you can configure at a cluster level how many replicas are stored. in the case of failure, swift's active consistency processes will ensure that you have the full durability of your data
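The idea of placing replicas so that one hardware fault cannot take out every copy can be sketched as follows. This is not Swift's ring; it is a simplified stand-in that ranks devices by a hash and picks one device per failure zone. The `DEVICES` list, the zone names, and the `place_replicas` and `surviving_copies` functions are hypothetical.

```python
import hashlib

# Hypothetical devices as (device name, failure zone) pairs.
DEVICES = [
    ("d1", "z1"), ("d2", "z1"),
    ("d3", "z2"), ("d4", "z2"),
    ("d5", "z3"), ("d6", "z3"),
]


def place_replicas(obj_name, replicas=3):
    """Pick `replicas` devices for an object, each in a different zone."""
    # Rank devices by a hash of (object, device) so placement is
    # deterministic and spreads different objects across devices.
    ranked = sorted(
        DEVICES,
        key=lambda d: hashlib.md5(f"{obj_name}:{d[0]}".encode()).hexdigest(),
    )
    chosen, used_zones = [], set()
    for name, zone in ranked:
        if zone in used_zones:
            continue  # never put two copies in the same failure zone
        chosen.append(name)
        used_zones.add(zone)
        if len(chosen) == replicas:
            break
    return chosen


def surviving_copies(chosen, failed_zone):
    """Return the devices still holding the object after one zone fails."""
    zone_of = dict(DEVICES)
    return [name for name in chosen if zone_of[name] != failed_zone]
```

Because each copy sits in its own zone, losing any single zone still leaves two of the three copies readable; a background process would then restore the missing copy, which this sketch does not model.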
<notmyname> there's a ton of great info at http://swiftstack.com/openstack-swift/ :-)
<anteaya> I really appreciate the intro
<anteaya> thanks so much for your time, notmyname
<notmyname> it's why I joined this channel ;-)
<anteaya> thank you