@PharkMillups
Created June 25, 2010 14:25
gotascii # question about terminology/definitions, does throughput impact
availability? i.e. if a system is at maximum throughput and some users
aren't able to submit requests is it still "available" as far as
"highly-available" is concerned? possibly that is something an SLA
would clarify? "available" would need to be defined in terms of
throughput...just trying to wrap my head around some of this heh
gotascii # basically, i think this talk here
http://www.royans.net/arch/talk-on-database-scalability/
and this blog post
http://www.royans.net/arch/what-is-scalability/
confused me a bit as far as scalability goes, specifically statements
like "Scalability is not improving latency, but increasing throughput".
it seems to me like throughput is just one aspect of a system you might
want to scale
benblack # true
gotascii # availability, storage capacity...a long list of other
system properties... something like "If you can’t figure out how
to improve performance while scaling out, its okay." doesn't seem true
if you are trying to scale out your performance
benblack # i think you'll find you can't scale out latency
only throughput
gotascii # does that make sense? can you scale out performance?
something like, i want to be able to add another machine and make
some operation execute faster
benblack # scalability and scale out are not congruent
gotascii # you can scale up latency though, is that right?
benblack # right
gotascii # like there's no way to add another node to the cluster
and make a process execute faster?
benblack # if you are overloaded, maybe
gotascii # what about something like a map reduce query?
benblack # what about it? latency _per operation_ unchanged
latency for entire mr job reduced...because you have more throughput
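
A rough sketch of ben's point about the map reduce job, under simplifying assumptions that are not from the conversation: every operation takes the same fixed time, operations are independent, and each node runs one operation at a time. The function names `job_time` and `cluster_throughput` are made up for illustration.

```python
import math


def job_time(num_ops: int, op_latency_s: float, nodes: int) -> float:
    """Wall-clock time for a batch of independent, equal-cost operations.

    Each node handles one operation at a time, so the batch finishes
    after ceil(num_ops / nodes) rounds of op_latency_s seconds each.
    """
    rounds = math.ceil(num_ops / nodes)
    return rounds * op_latency_s


def cluster_throughput(op_latency_s: float, nodes: int) -> float:
    """Operations completed per second across the whole cluster."""
    return nodes / op_latency_s


# Per-operation latency is the same 0.5 s in both cases; only the
# number of nodes (and therefore throughput) changes.
one_node = job_time(num_ops=1000, op_latency_s=0.5, nodes=1)
ten_nodes = job_time(num_ops=1000, op_latency_s=0.5, nodes=10)
```

In this model adding nodes never changes `op_latency_s`; the job finishes sooner only because more operations complete per second.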
gotascii # ok makes sense, i wasn't understanding the granularity of "operation"
throughput in operations is increased so the total job time,
job meaning a batch of operations, decreases
that is tricky because people talk about throughput in terms of
how many users can access the system at once
benblack # that is not throughput
gotascii # when a user might be issuing a job that has many operations...i
do see now that those things are equivalent
benblack # throughput is operations per unit time
latency is time per operation
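
To make the two definitions concrete, here is a minimal sketch that computes both numbers from the same record of operations. The record format, a list of hypothetical `(start, end)` timestamps in seconds, is an assumption for illustration and not something from the conversation.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start_s, end_s) for one operation


def mean_latency(spans: List[Span]) -> float:
    """Latency: average time per operation."""
    return sum(end - start for start, end in spans) / len(spans)


def throughput(spans: List[Span]) -> float:
    """Throughput: operations completed per unit of wall-clock time."""
    first_start = min(start for start, _ in spans)
    last_end = max(end for _, end in spans)
    return len(spans) / (last_end - first_start)


# Four operations of 2 s each, run one at a time...
serial = [(0.0, 2.0), (2.0, 4.0), (4.0, 6.0), (6.0, 8.0)]
# ...and the same four operations run two at a time.
parallel = [(0.0, 2.0), (0.0, 2.0), (2.0, 4.0), (2.0, 4.0)]
```

Both runs have the same mean latency, since each operation still takes 2 s, but the second run has twice the throughput, which is the distinction ben is drawing.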
gotascii # awesome. ok that helps A LOT
gotascii # so going back to this statement, "Scalability is not improving latency,
but increasing throughput": it is correct in the first half, and too narrow in the second half?
scalability is not about improving latency, but about increasing _some scalable system property_,
one of which is throughput. thanks for explaining that ben, much appreciated