Remember that this is a brain dump of the current state of the server setup. A big refactor is under way to handle every particularity automatically, but it won’t be ready to roll out until the end of January.
Updated on Mar 20 2015
DONE.
Except for webat25.org
From time to time I open all those links in tabs to get a quick overview of whether all is fine. It’s a poor man’s uptime status check that I do until I get better metrics.
New production has more fine-grained checks overall. Refer to Reports to review system status.
Nevertheless, here are a few sanity checks:
- https://stats.webplatform.org/index.php?module=API&method=VisitsSummary.get&idSite=1&period=day&date=today&format=JSON&token_auth=REPLACEME (use the token_auth provided in an email you should have received)
- http://www.webplatform.org/talk/chatlogs/#home
- http://blog.webplatform.org/2013/02/pointing-toward-the-future/
- http://project.webplatform.org/blog
- http://www.webat25.org/news/the-web-at-25-reflections-from-germany
- https://notes.webplatform.org/
- https://docs.webplatform.org/test/css/properties/border-radius (test wiki)
- https://docs.webplatform.org/wiki/css/properties/border-radius (live wiki)
- https://docs.webplatform.org/t/api.php?action=query&meta=siteinfo&siprop=statistics
- https://docs.webplatform.org/w/api.php?action=query&meta=siteinfo&siprop=statistics
- https://docs.webplatform.org/w/api.php?action=query&list=recentchanges&rcprop=user%7Cparsedcomment%7Cflags%7Ctimestamp%7Ctitle%7Csizes%7Credirect%7Cids%7Cloginfo&rclimit=10
- https://accounts.webplatform.org/
- https://profile.accounts.webplatform.org/
- https://api.accounts.webplatform.org/
- https://oauth.accounts.webplatform.org/
- https://docs.webplatform.org/compat/data.json
- https://docs.webplatform.org/compat/data-human.json
- EVERY VM (except mail, in both new and old production) runs exim4 and relays to mail.webplatform.org, see Accessing a VM through SSH in the new documentation
- ElasticSearch is ONLY required by Hypothesis, nothing else yet.
- Any VMs that are non vital, or already migrated to the new cluster, are stopped
- To see which VMs are running, use nova list from salt.webplatform.org
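The tab-opening routine for the sanity-check URLs listed earlier can also be scripted. What follows is a minimal sketch, assuming curl is installed; the function names classify_status and check_urls are made up for illustration and are not part of the existing tooling. It prints one line per URL with a rough verdict, the HTTP status code and the URL.

```shell
# Map an HTTP status code to a rough verdict.
# curl reports 000 when it could not connect at all, which falls under "fail".
classify_status() {
  case "$1" in
    2??) echo ok ;;
    3??) echo redirect ;;
    *)   echo fail ;;
  esac
}

# Read one URL per line on stdin and print "<verdict> <code> <url>".
check_urls() {
  while IFS= read -r url; do
    [ -n "$url" ] || continue
    # -s: quiet, -o /dev/null: discard the body, -w: print only the status code
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")
    printf '%s %s %s\n' "$(classify_status "$code")" "$code" "$url"
  done
}
```

It could be fed like this: `printf '%s\n' https://notes.webplatform.org/ https://accounts.webplatform.org/ | check_urls`. A 2xx answer only says the server responded; it does not prove the page content is right, so this complements the manual look rather than replacing it.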
Should be handled automatically just fine.
The backup role VM type rsyncs, from a root cronjob, what is on both the salt AND masterdb hosts.
- db1-masterdb:
  - What: MySQL databases
  - Crontabs defined in: salt:/srv/salt/backup/db.sls
  - Script: /usr/local/sbin/db.sh, as root cronjob
For logs, refer to Centralized logging in the new documentation. Both new and old production receive logs through UDP, and the documentation is valid for both clusters.
Other poking in new production can be done by following what’s described in Reports to review system status.
htop
sar # sysstat is only in old infrastructure
netstat -tulpn
lsof -P -i -n | cut -f 1 -d " "| uniq | tail -n +2
lsof -P -i -n
lsof -w -l
initctl list | grep running
netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -n
Note: those commands are my favourite "lazy" command aliases that I’m gathering. They are available under names such as wpd-lsof (they’ll be renamed to lazy-lsof). To get them, look in /etc/profile.d/wpd_aliases.sh, which exists in both old and new environments.
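The last command in the list above counts open connections per remote address. As written it also counts the two header lines that netstat prints. Here is a small sketch of the same pipeline wrapped in a function that skips them; the name count_remote_addrs is hypothetical and is not one of the existing aliases.

```shell
# Count connections per remote address.
# Expects `netstat -ntu` style output on stdin: two header lines,
# then rows where field 5 is the foreign address as IP:port.
count_remote_addrs() {
  awk 'NR > 2 && $5 != "" {print $5}' |  # keep the foreign address column only
    cut -d: -f1 |                        # drop the port
    sort | uniq -c |                     # count identical addresses
    sort -n                              # busiest address last
}
```

Usage would be `netstat -ntu | count_remote_addrs`. Like the original one-liner, cutting on the first colon assumes IPv4 addresses; IPv6 peers would be truncated.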
New production has more fine-grained metrics and uptime checks system. Refer to Reports to review system status.
As said in The Salt Master, under Centralized logging, it’s not an ideal solution. It should be fixed by webplatform/ops#117.
- appN and webat25 run it
- To check apache health, look at the script server_stats_tunnels.sh below and use Apache /server-status from the tunnel
- The script wpd-apache-watchdog uses it to see if apache2 is running, and logs restarts in /tmp/apache-watchdog.log. See the script apache-watchdog.sh below
New production has more fine-grained apache/nginx checks. Refer to Reports to review system status.
Some VMs have Monit to ensure services are up, and it restarts them for us. It does basically what apache-watchdog does, but isn’t limited to checking whether an HTTP server responds on localhost port 80 and restarting the apache service.
New production has Monit across the board. Refer to Reports to review system status.
Moved into new production.
Refer to Reports to review system status.
Refer to Roles and environment levels page.
See also new architecture documentation for app
Generic web application container. Currently runs on Ubuntu 14.04, serving HTTP requests from Apache2 2.4.x with MPM Prefork.
Note: the salt commands below are examples that run against app11. To deploy to production, you will have to deploy on each of the other available app nodes as well. To list them, run salt-run manage.status on the salt VM.
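Since the same state has to be applied to every app node, a small helper can spell out the commands before anything is run. This is only a sketch: print_deploy_cmds is a made-up name, and app12 in the usage line is a hypothetical node, so take the real list from salt-run manage.status.

```shell
# Print (do not run) one salt command per node for a given state.
# Usage: print_deploy_cmds STATE NODE [NODE...]
print_deploy_cmds() {
  state="$1"
  shift
  for node in "$@"; do
    printf 'salt %s state.sls %s\n' "$node" "$state"
  done
}
```

For example, `print_deploy_cmds code.root app11 app12` prints two lines that can be reviewed and then pasted on the salt VM. Salt can also target minions with a glob, as in `salt 'app*' state.sls code.root`, but that applies the state to every matching minion at once, so the explicit list is the more cautious choice.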
- Code:
  - Deploy all code in new production: wpd-deploy app
  - Homepage:
    - VHost: /etc/apache2/sites-enabled/00-webplatform.conf
    - DocRoot: /var/www
    - Deploy command: salt app11 state.sls code.root
    - Salt Master code deployment sls: salt:/srv/salt/code/root.sls
    - Salt Master code clone: salt:/srv/code/www
  - MediaWiki:
    - VHost: /etc/apache2/sites-enabled/01-docs.conf
    - DocRoot: /srv/webplatform/wiki/wpwiki
    - Salt Master deploy command: salt app11 state.sls code.docs_nextgen
    - Salt Master code deployment sls: salt:/srv/salt/code/docs_nextgen.sls
    - Salt master code clone: salt:/srv/code/docs/nextgen
  - WebPlatform.com:
    - VHost: /etc/apache2/sites-enabled/05-webplatform-com.conf
    - DocRoot: /srv/webplatform/webplatform-com/out
    - Salt Master deploy command: salt app11 state.sls code.root-com
    - Salt Master code deployment sls: salt:/srv/salt/code/root-com.sls
    - Salt master code clone: None, only a static file in salt:/srv/salt/code/files/root-com/index.html
  - Dabblet:
    - VHost: /etc/apache2/sites-enabled/09-dabblet.conf
    - DocRoot: /srv/webplatform/dabblet
    - Salt Master deploy command: salt app11 state.sls code.dabblet
    - Salt Master code deployment sls: salt:/srv/salt/code/dabblet.sls
    - Salt master code clone: salt:/srv/code/dabblet/
  - LumberJack Web UI:
    - VHost: An alias in /etc/apache2/sites-enabled/00-webplatform.conf
    - DocRoot: /srv/webplatform/bots/lumberjack
    - Salt master code clone: salt:/srv/code/bots/lumberjack/
    - Note: Nothing should need changing here; it’s a sketchy zone for now.
- Relies on VMs (service):
  - db (MySQL server, for LumberJack Web UI)
  - memcache (Memcache)
- Health checks in MediaWiki below, only a wpd-apache-watchdog script through cron
See also new architecture documentation for accounts
It’s the upcoming accounts system, currently only in use for notes.webplatform.org. The software is a fork of Mozilla Firefox Accounts (a.k.a. FxA).
- Code (listed in preferred startup order):
  - Deploy all code: salt accounts state.sls code.accounts
  - VHost: /etc/nginx/sites-enabled/accounts
  - OAuth:
    - DocRoot: /srv/webplatform/auth/fxa-oauth-server
    - Init script: /etc/init/fxa-oauth-server.conf
    - Restart command: monit restart fxa-oauth-server
  - Auth:
    - DocRoot: /srv/webplatform/auth/fxa-auth-server
    - Init script: /etc/init/fxa-auth-server.conf
    - Restart command: monit restart fxa-auth-server
  - Content:
    - DocRoot: /srv/webplatform/auth/fxa-content-server
    - Init script: /etc/init/fxa-content-server.conf
    - Restart command: monit restart fxa-content-server
  - Profile:
    - DocRoot: /srv/webplatform/auth/fxa-profile-server
    - Init script: /etc/init/fxa-profile-server.conf
    - Restart command: monit restart fxa-profile-server
- Local services:
  - fxa-oauth-server
  - fxa-auth-server
  - fxa-content-server
  - nginx
  - monit
- Relies on VMs (service):
  - masterdb (mysql)
- Health checks in Accounts below
See also new architecture documentation for notes
- Code:
  - Hypothesis:
    - VHost: /etc/nginx/sites-enabled/notes
    - DocRoot: /srv/webplatform/notes-server
    - Restart command: monit restart hypothesis-service
    - Salt Master code clone: None, manual clone at the moment
- Local services:
  - hypothesis
  - nginx
  - monit
- Relies on VMs (service):
  - accounts (fxa-content-server, fxa-auth-server, fxa-oauth-server, fxa-profile-server)
  - elastic1 (elasticsearch)
- Health checks in Hypothesis below
See also new architecture documentation for blog
- Code:
  - Deploy all code: wpd-deploy blog
  - WordPress:
    - VHost: /etc/apache2/sites-enabled/blog
    - DocRoot: /srv/webplatform/blog/current
See also new architecture documentation for project
- Code:
  - Deploy all code: wpd-deploy project
  - BugGenie:
    - VHost: /etc/apache2/sites-enabled/buggenie
    - DocRoot: /srv/webplatform/buggenie
    - Salt master code clone: salt:/srv/code/buggenie/
See also new architecture documentation for bots
It only runs a custom Python IRC logger that was called LumberJack and is now known as Pierc; we are using our own fork. That service will soon be replaced by something else.
There are two components: a web viewer (in PHP) and a listener (in Python).
- Code:
  - LumberJack:
    - Clone: /srv/webplatform/lumberjack
    - Init script: /etc/init/lumberjack.conf
    - Restart command: service lumberjack restart
- Local services:
  - lumberjack
Not migrated. It won’t be; it will run as is until the end of May.
Do not invest anything here. The full site will be exported as a static site in a few months.
- Code:
  - ExpressionEngine:
    - VHost: /etc/apache2/sites-enabled/buggenie
    - DocRoot: /srv/webplatform/web25ee/
    - Salt master code clone: salt:/srv/code/web25ee/
- Relies on VMs (service):
  - db4 (mysql)
  - memcacheN (Memcached), see /etc/php5/conf.d/memcached.ini
- Health checks in ExpressionEngine below, only a wpd-apache-watchdog script through cron
Refer to Roles and environment levels page. Concepts are the same in both old and new production.
Let’s keep those notes in case they are needed:
- Hosted on VMs with role app
- Typical URLs:
  - Main wiki is docs.webplatform.org/docs/, called wpwiki
  - Test wiki is docs.webplatform.org/test/, called wptestwiki
- Exposed by Fastly, to test and see associations refer to server_stats_tunnels.sh and hosts.txt below
- Main wiki config on Salt Master server (salt.webplatform.org) is /srv/salt/code/files/docs/wpwiki.php.jinja
  - gets renamed as /srv/webplatform/wiki/wpwiki/LocalSettings.php on appN VMs
  - Handled by salt state in /srv/salt/code/docs_nextgen.sls
- File /srv/webplatform/wiki/Settings.php is called by both wikis (wpwiki, wptestwiki)
  - On the deployment server, the file is /srv/salt/code/files/docs/Settings.php.jinja
  - gets renamed as /srv/webplatform/wiki/Settings.php on appN VMs
  - Handled by salt state in /srv/salt/code/docs_nextgen.sls
  - Can be called like this: salt app8 state.sls code.docs_nextgen
- Main wiki config file is in /srv/webplatform/wiki/wpwiki/LocalSettings.php, which sets database config and how to get image uploads
- To check apache health, look at the script server_stats_tunnels.sh below and use Apache /server-status from the tunnel
- Health checks: root crontab runs /usr/local/sbin/wpd-apache-watchdog every 2 minutes, restarts are logged in /tmp/apache-watchdog.log, see apache-watchdog.sh below.
- Hosted on VMs with role notes
- Typical URL is notes.webplatform.org
- NOT Exposed by Fastly, to get IPs use nova list from salt.webplatform.org
- Served directly from NGINX
- Configs: /srv/webplatform/notes-server/production.ini
- Health checks through Monit:

      root@accounts:~# monit summary
      The Monit daemon 5.6 uptime: 1d 11h 44m
      System 'notes.webplatform.org' Running
      Remote Host 'elasticsearch-remote' Online with all services
      Remote Host 'hypothesis-service' Online with all services
      Process 'nginx' Running
      File 'nginx_bin' Accessible
      File 'nginx_rc' Accessible

  Checks configs are described in /etc/monit/conf.d/hypothesis.
- Hosted on VMs with role accounts
- NOT Exposed by Fastly, to get IPs use nova list
- Served directly from NGINX
- Typical URLs:
  - accounts.webplatform.org (a.k.a. fxa-content-server)
  - oauth.accounts.webplatform.org (a.k.a. fxa-oauth-server)
  - api.accounts.webplatform.org (a.k.a. fxa-auth-server)
  - profile.accounts.webplatform.org (a.k.a. fxa-profile-server)
- Configs:
  - fxa-content-server: /srv/webplatform/auth/fxa-content-server/server/config/production.json
  - fxa-auth-server: /srv/webplatform/auth/fxa-auth-server/config/prod.json
  - fxa-profile-server: /srv/webplatform/auth/fxa-profile-server/config/prod.json
  - fxa-oauth-server: /srv/webplatform/auth/fxa-oauth-server/config/prod.json
- Health checks, through Monit:

      root@accounts:~# monit summary
      The Monit daemon 5.6 uptime: 4h 20m
      Remote Host 'fxa-profile-server' Online with all services
      Program 'fxa-profile-server-check' Status ok
      Remote Host 'fxa-oauth-server' Online with all services
      Remote Host 'fxa-content-server' Online with all services
      Remote Host 'fxa-auth-server' Online with all services
      System 'accounts.webplatform.org' Running
      Process 'nginx' Running
      File 'nginx_bin' Accessible
      File 'nginx_rc' Accessible

  Checks configs are described in /etc/monit/conf.d/*.
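When scanning monit summary output like the samples above, it can help to print only the entries that are not in a healthy state. The sketch below treats the four statuses seen in those samples (Running, Accessible, Online with all services, Status ok) as healthy and prints everything else; that list is an assumption drawn from the samples, and monit_unhealthy is a made-up name.

```shell
# Read `monit summary` output on stdin and print the entries whose status
# is not one of the healthy states seen in the sample output.
# Prints nothing when all entries look healthy.
monit_unhealthy() {
  grep -E "^(System|Process|File|Program|Remote Host) '" |
    grep -Ev "(Running|Accessible|Online with all services|Status ok)[[:space:]]*$" ||
    true  # grep exits non-zero when nothing is left; that is the good case here
}
```

Usage would be `monit summary | monit_unhealthy`; an empty result means nothing stood out. The daemon uptime header is ignored because it does not start with one of the entry types.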
- Hosted on VMs with role blog
- Typical URL is blog.webplatform.org/docs/
- Exposed by Fastly, to test and see associations refer to server_stats_tunnels.sh
- Configs:
  - Main config: /srv/webplatform/blog/current/wp-config.php
  - Code in VM: /srv/webplatform/blog/current/
  - Code in Deployment: none. It’s currently a git clone from the WordPress GitHub mirror; the theme is in Deployment at /srv/code/blog/webplatform-wordpress-theme/ and deployed as /srv/webplatform/blog/current/wp-content/themes/webplatform/
- Health checks: root crontab runs /usr/local/sbin/wpd-apache-watchdog every 2 minutes, restarts are logged in /tmp/apache-watchdog.log, see apache-watchdog.sh below.
- Hosted on VMs with role project
- Typical URL is project.webplatform.org
- Exposed by Fastly, to test and see associations refer to server_stats_tunnels.sh
- Configs:
  - /srv/webplatform/buggenie/core/b2db_bootstrap.inc.php
  - /srv/webplatform/buggenie/installed (if you have to reinstall, BugGenie checks this)
- Health checks: root crontab runs /usr/local/sbin/wpd-apache-watchdog every 2 minutes, restarts are logged in /tmp/apache-watchdog.log, see apache-watchdog.sh below.
- Hosted on VMs with role bots (listener)
- Hosted on VMs with role app (web viewer)
- Typical URL is www.webplatform.org/talk/chatlogs
- Exposed by Fastly, to test and see associations refer to server_stats_tunnels.sh
- Two components:
  - Web UI, hosted on appN VMs
  - Listener daemon running on bots VM, as LumberJack
- Configs:
  - appN:/srv/webplatform/bots/lumberjack/config.php
  - bots:/srv/webplatform/lumberjack/mysql_config.txt
- Health checks: root crontab runs /usr/local/sbin/wpd-apache-watchdog every 2 minutes, restarts are logged in /tmp/apache-watchdog.log, see apache-watchdog.sh below.
Not migrated. Won’t be.
- Hosted on webat25 (only one, will be replaced by a static version after holidays)
- Typical URLs are:
  - www.webat25.org through Fastly
  - ee.webat25.org (only for EE CMS, no caching)
- Exposed by Fastly, to test and see associations refer to server_stats_tunnels.sh and hosts.txt below
- Configs:
  - /srv/webplatform/web25ee/backoffice/expressionengine/config/database.php
  - /srv/webplatform/web25ee/backoffice/expressionengine/config/config.php
- Health checks: root crontab runs /usr/local/sbin/wpd-apache-watchdog every 2 minutes, restarts are logged in /tmp/apache-watchdog.log, see apache-watchdog.sh below.