The worst-performing resource in the cloud (and the most annoying to scale) is disk I/O. Eliminate disk I/O wherever possible.
Quick rundown on how you might use AWS services:
http://magento.stackexchange.com/questions/459/running-magento-in-an-aws-environment/464#464
Rundown on how to get static assets off of an individual server (use the OnePica CDN extension):
http://magento.stackexchange.com/questions/462/magento-media-assets-in-amazon-s3?rq=1
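For MySQL, these two are pretty key to change from the RDS defaults: set thread_cache_size to 16 and innodb_flush_log_at_trx_commit to 0.
The second one involves a tradeoff in durability: with 0, the InnoDB log is written and flushed to disk about once per second instead of at every commit, so a crash can lose roughly the last second of transactions. IMO that is rendered moot by multi-AZ replication in RDS.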
http://stackoverflow.com/questions/10458095/innodb-bottleneck-relaxing-acid-to-improve-performance
http://dba.stackexchange.com/questions/12611/is-it-safe-to-use-innodb-flush-log-at-trx-commit-2
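One way the two settings above might be applied, as a rough sketch: RDS takes them through a custom DB parameter group, which the AWS CLI can modify. The group name magento-mysql and the function name are made up for illustration, and ApplyMethod=immediate assumes both parameters are dynamic on your engine version (use pending-reboot if not). The script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: "magento-mysql" is a hypothetical custom parameter group name.
# The default RDS parameter group cannot be modified, so a custom group must
# already exist and be attached to the DB instance.
PARAM_GROUP="${PARAM_GROUP:-magento-mysql}"

rds_tuning_command() {
    # $1: parameter group name. Prints the AWS CLI command instead of running it.
    printf '%s\n' "aws rds modify-db-parameter-group --db-parameter-group-name $1 --parameters ParameterName=thread_cache_size,ParameterValue=16,ApplyMethod=immediate ParameterName=innodb_flush_log_at_trx_commit,ParameterValue=0,ApplyMethod=immediate"
}

rds_tuning_command "$PARAM_GROUP"
```

Review the printed command, then paste it into a shell that has AWS credentials configured.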
Apache access and error logging is actually a big bottleneck. I recommend Papertrailapp.com and include installation instructions below, but any 'log to network' solution will work. Disable any logging to disk at /var/log and log to the network instead.
The other big bottleneck is indexing. Unfortunately I don't have a lot of experience to share around indexing large numbers of products, but Magento is pretty good for that use case out of the box. For indexing large numbers of stores, try lazy loading the index on a per-store basis. This generally means a 5-10 second performance hit for the first user to reach a store in each indexing interval, which isn't too bad.
https://papertrailapp.com/systems/setup
# sudo sh
# cd /etc
# wget https://papertrailapp.com/tools/syslog.papertrail.crt
# yum install rsyslog-gnutls
If syslogd/sysklogd is installed and conflicts with the yum install above, follow the steps in these release notes:
http://aws.amazon.com/amazon-linux-ami/2011.02-release-notes/
Set the hostname in /etc/rsyslog.conf:
# vi /etc/rsyslog.conf
$LocalHostName unique-name-for-host
# cat > /etc/rsyslog.d/papertrail.conf
$ActionQueueType LinkedList # memory queue (when used without other options)
$ActionQueueSize 100000 # this determines how much memory to use, 100k is prob enough for c1.medium
$ActionQueueTimeoutEnqueue 500 # make new queue entries time out after 500ms
$ActionResumeRetryCount -1 # infinite retries if host is down
$DefaultNetstreamDriverCAFile /etc/syslog.papertrail.crt # trust these CAs
$DefaultNetstreamDriver gtls # use gtls netstream driver
$ActionSendStreamDriverMode 1 # require TLS
$ActionSendStreamDriverAuthMode x509/name # authenticate by hostname
*.* @@logs.papertrailapp.com:18069
# /etc/init.d/rsyslog restart
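A quick way to check the rsyslog setup, as a sketch: validate the config with rsyslogd -N1, then send a test line with logger and look for it in the Papertrail event viewer. The function name and the papertrail-test tag are made up for illustration.

```shell
#!/bin/sh
# Sketch only: send_test_message and the "papertrail-test" tag are hypothetical names.
send_test_message() {
    # -N1 asks rsyslogd to check its configuration and exit without starting
    rsyslogd -N1 || return 1
    # logger writes one line to the local syslog socket, which rsyslog then forwards
    logger -t papertrail-test "test message from $(hostname)"
}
```

Run send_test_message as root after the restart. If nothing shows up remotely, the rsyslogd -N1 output is the first place to look.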
OK, the papertrail.conf below doesn't work as well as I had hoped: you also have to comment out ErrorLog and CustomLog in every vhost and in httpd.conf...
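Commenting out the existing directives by hand gets tedious with many vhosts, so here is a rough sed sketch of that step. The function name is made up, it assumes the directives are spelled exactly ErrorLog and CustomLog (Apache itself is case-insensitive), and it keeps a .bak copy of each file it touches. Run it before creating papertrail.conf so the new file is not affected.

```shell
#!/bin/sh
# Sketch only: comment_out_disk_logs is a hypothetical helper name.
comment_out_disk_logs() {
    # $1: an Apache config file.
    # Prefixes every uncommented ErrorLog/CustomLog line with '#', keeping indentation.
    # Lines that are already commented do not match and are left alone.
    sed -i.bak -E 's/^([[:space:]]*)(ErrorLog|CustomLog)([[:space:]])/\1#\2\3/' "$1"
}
```

Something like this would apply it to the usual locations (adjust the paths to wherever your vhosts live):
# for f in /etc/httpd/conf/httpd.conf /etc/httpd/conf.d/*.conf; do comment_out_disk_logs "$f"; done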
# cat > /etc/httpd/conf.d/papertrail.conf
ErrorLog syslog:local1
CustomLog "|/usr/local/bin/pipe_syslog" combined
http://www.oreillynet.com/pub/a/sysadmin/2006/10/12/httpd-syslog.html
Note that this pipe_syslog passes in 'apache' as the process name, so it's not as generic as I'd like. It can probably take an argument in the CustomLog directive, but that needs some experimenting.
Might want to use this instead: http://blog.papertrailapp.com/post/12137863501/send-apache-access-logs-to-remote-syslog-in-1-line
# cat > /usr/local/bin/pipe_syslog
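#!/usr/bin/perl
# Read Apache access log lines from STDIN and forward each one to syslog.
use strict;
use warnings;
use Sys::Syslog qw( :DEFAULT setlogsock );
setlogsock('unix');
# openlog takes (ident, options, facility); options are one comma-separated string
openlog('apache', 'cons,pid', 'local2');
while (my $log = <STDIN>) {
    chomp $log;
    # pass the line as an argument so any '%' in it is not treated as a format code
    syslog('notice', '%s', $log);
}
closelog();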
# chmod +x /usr/local/bin/pipe_syslog
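Also modify the combined log format so that, behind a load balancer or proxy, the original client address is logged instead of the balancer's: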
# vi /etc/httpd/conf/httpd.conf
Search for 'combined' and, in that LogFormat line, replace %h with %{X-Forwarded-For}i
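The same edit can be scripted instead of done in vi. This is a sketch under assumptions: the function name is made up, and it expects the stock LogFormat line that ends in the nickname combined. It keeps a .bak copy of the file.

```shell
#!/bin/sh
# Sketch only: use_forwarded_for is a hypothetical helper name.
use_forwarded_for() {
    # $1: httpd.conf path.
    # Only LogFormat lines whose nickname is exactly "combined" are touched,
    # so "common" and "combinedio" keep logging %h.
    sed -i.bak -E '/^[[:space:]]*LogFormat .* combined[[:space:]]*$/ s/%h/%{X-Forwarded-For}i/' "$1"
}
```

Usage would be something like:
# use_forwarded_for /etc/httpd/conf/httpd.conf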
# /usr/sbin/apachectl graceful