Some notes on Performance
--------------------------------------------------
Server Stack
--------------------------------------------------
As with other *nix applications, you can control nginx by sending it signals.
To use signals, you'll need the process ID.
nginx is architected as a single master
process with any number of worker processes.
You can stop the master process with the
TERM signal (kill -15 6850), which also stops the worker processes. If
you change a configuration file, you can reload it without restarting
nginx:
ezra$ # The -c option is only needed if your .conf file is in a custom location.
ezra$ sudo /usr/local/nginx/sbin/nginx -c /etc/nginx/nginx.conf
ezra$ sudo kill -HUP 614
The first command starts nginx with the new configuration file. The second
sends the HUP signal, which asks nginx to reload its configuration.
If the new configuration is valid, nginx loads it
into new worker processes and kills the old ones. Now you know enough
to run the server, stop the server, and configure the server. The next
step is building a configuration file.
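One practical note before that: you don't have to hunt for the process ID by
hand, because nginx writes its master PID to a pid file. A minimal sketch of
the stop/reload cycle, assuming the default pid file location of
/usr/local/nginx/logs/nginx.pid (adjust the path if your build puts it elsewhere):

ezra$ # Stop the master process (and with it the workers) via TERM.
ezra$ sudo kill -15 $(cat /usr/local/nginx/logs/nginx.pid)
ezra$ # Or reload the configuration in place via HUP.
ezra$ sudo kill -HUP $(cat /usr/local/nginx/logs/nginx.pid)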
So why do we put Nginx in front? Suppose we use only Mongrel: we start one
Mongrel instance listening on port 80. If each request takes, say, 500 ms to
complete, we can handle 2 requests per second and nothing more. That's clearly
not enough, so let's start another Mongrel instance. But we can't have it
listen on port 80, since that port is already taken by the first instance,
and there's nothing we can do about it.
So we need something in front that can handle multiple Mongrel instances while
still listening on port 80. You throw in an Nginx server, which proxies
(dispatches) the requests to your many Mongrel instances, and you can now add
more instances to serve more clients simultaneously, as in the sketch below.
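A minimal sketch of such a setup, assuming two Mongrel instances already
running on ports 8000 and 8001 (the ports and the upstream name are placeholders):

# Inside the http { } block of nginx.conf.
upstream mongrels {
    # Requests are spread across these backends (round-robin by default).
    server 127.0.0.1:8000;
    server 127.0.0.1:8001;
}

server {
    listen 80;
    location / {
        # Forward every request to one of the Mongrel instances.
        proxy_pass http://mongrels;
    }
}

Adding capacity is then just a matter of starting another Mongrel on a free
port and adding one more server line to the upstream block.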
--------------------------------------------------
PostgreSQL
--------------------------------------------------
Increase the parameter shared_buffers, which controls the amount of memory PostgreSQL uses
for its private buffer cache.
The most important memory-allocation parameter is work_mem. The default value of 1MB allows
any sort, hash join, or materialize operation to use up to 1MB of physical memory. Larger
operations will use a less efficient algorithm that allows data to spill to disk. Raising
this value can dramatically improve the performance of certain queries, but it's important
not to overdo it.
It's also a good idea to set the related parameter maintenance_work_mem, which controls
the amount of physical memory PostgreSQL will attempt to use for maintenance operations,
such as routine vacuuming and index creation.
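As a sketch, all three memory parameters live in postgresql.conf; the values
below are illustrative assumptions for a dedicated server with about 4GB of
RAM, not figures from this note:

# postgresql.conf -- memory settings (illustrative values)
shared_buffers = 1GB           # private buffer cache, roughly 25% of RAM here
work_mem = 16MB                # per sort/hash-join/materialize operation
maintenance_work_mem = 256MB   # vacuuming, index creation, other maintenance

Keep in mind that work_mem is a per-operation limit and a single query can run
several such operations at once, so total memory use can exceed what
work_mem times the number of connections suggests.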
You should also increase wal_buffers, which defaults to 64kB. Although
even this very small setting does not always cause a problem, there are situations where it
can result in extra fsync calls and degrade overall system throughput. Increasing this value
to 1MB or so can alleviate the problem.
Increasing the checkpoint_segments parameter, which defaults to 3, can dramatically improve
performance during bulk data loads. A reasonable starting value is 30.
It also makes sense to increase checkpoint_completion_target, which defaults to 0.5, to 0.9;
this will decrease the performance impact of checkpointing on a busy system (but is ineffective
for small values of checkpoint_segments, which is why the default is 0.5).
Finally, increasing checkpoint_timeout from 5 minutes to a larger value, such as 15 minutes,
can reduce the I/O load on your system, especially when using large values for shared_buffers.
The downside of making these adjustments is that your system will use a modest amount of additional
disk space, and will take longer to recover in the event of a crash. However, for most users,
this is a small price to pay for a significant performance improvement.
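Collected into one place, the WAL and checkpoint settings discussed above
might look like this sketch in postgresql.conf (the values are the ones
suggested in the text):

# postgresql.conf -- WAL and checkpoint settings
wal_buffers = 1MB                    # up from the 64kB default
checkpoint_segments = 30             # up from the default of 3; helps bulk loads
checkpoint_completion_target = 0.9   # spread checkpoint I/O over the interval
checkpoint_timeout = 15min           # up from the 5min default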
The parameters random_page_cost and seq_page_cost control the planner's estimate of how expensive
it will be to obtain each database page. The default values assume very little caching, so it's
frequently a good idea to reduce them. Even if your database is significantly larger than physical
memory, you might want to try setting these parameters to 2 and 1 (rather than the default values
of 4 and 1) to see whether you get better query plans that way. If your database fits entirely within
memory, you can lower these values much more, perhaps to 0.1 and 0.1. Never set random_page_cost less
than seq_page_cost, but consider setting them equal (or very close to equal) if your database fits
mostly or entirely within memory.
You should also configure the parameter effective_cache_size. Despite being measured in megabytes,
this parameter does not allocate any memory. Instead, it is used by the query planner to estimate
certain caching effects. When this parameter is set too low, the planner may decide not to use an
index even when it would be beneficial to do so. An appropriate value is approximately 75% of physical memory.
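A sketch of these planner settings, sticking with the assumed 4GB machine from
above and a database that is reasonably well cached:

# postgresql.conf -- planner cost settings (illustrative values)
random_page_cost = 2          # default is 4; lower it when data is well cached
seq_page_cost = 1             # keep this <= random_page_cost
effective_cache_size = 3GB    # ~75% of physical memory; allocates nothing itself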
Finally, for best performance, it's a good idea to consider setting the synchronous_commit parameter to off.
When this parameter is turned off, an unexpected crash or power failure could result in the loss of a
transaction that was reported to the client as committed. For financial or other mission-critical applications,
this is unacceptable, and the default value of on should be retained. However, many web applications can
tolerate the loss of a few seconds' worth of updates in the event of a crash, and the performance gain from
changing this setting can be massive.
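Note that synchronous_commit need not be all-or-nothing: it can be turned off
per session or even per transaction with SET LOCAL, keeping durable commits
for critical work. A sketch, using a hypothetical page_views table:

-- Relax durability for this one non-critical transaction only.
BEGIN;
SET LOCAL synchronous_commit TO off;
UPDATE page_views SET hits = hits + 1 WHERE page_id = 42;  -- hypothetical table
COMMIT;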
--------------------------------------------------
Ruby
--------------------------------------------------
http://www.rubyenterpriseedition.com/documentation.html#_garbage_collector_performance_tuning
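The page above covers tuning Ruby Enterprise Edition's garbage collector
through environment variables. A hedged sketch of how that looks in practice;
the variable names come from the REE documentation, but the values are
commonly cited examples, not measurements from this note:

ezra$ # REE reads GC tuning parameters from the environment at startup.
ezra$ export RUBY_HEAP_MIN_SLOTS=500000
ezra$ export RUBY_GC_MALLOC_LIMIT=50000000
ezra$ export RUBY_HEAP_FREE_MIN=4096
ezra$ ruby myapp.rb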