# In .ebextensions/01_celery.config

files:
  "/etc/systemd/system/celery.service":
    mode: "000644"
    owner: celery
    group: celery
    content: |
      [Unit]
      Description=Celery Service
      After=network.target

      [Service]
      # I saw some other tutorials suggesting using Type=simple, but that didn't work for me. Type=forking works
      # as long as you're using an instance with at least 2.0 gigs of RAM, but on a t2.micro instance it was
      # running out of memory and crashing.
      Type=forking
      Restart=on-failure
      RestartSec=10
      User=celery
      Group=celery
      # You can have multiple EnvironmentFile= variables declared if you have files with variables.
      # The celery docs on daemonizing celery with systemd put their environment variables in a file called
      # /etc/conf.d/celery, but I'm choosing to instead set the celery variables as environment variables so that
      # celery can also access the necessary variables for interacting with Django.
      EnvironmentFile=/opt/elasticbeanstalk/deployment/env
      WorkingDirectory=/var/app/current
      ExecStart=/bin/sh -c '${CELERY_BIN} multi start worker \
          -A ${CELERY_APP} --pidfile=${CELERYD_PID_FILE} \
          --logfile=${CELERYD_LOG_FILE} --loglevel=INFO --time-limit=300 --concurrency=2'
      ExecStop=/bin/sh -c '${CELERY_BIN} multi stopwait worker \
          --pidfile=${CELERYD_PID_FILE}'
      ExecReload=/bin/sh -c '${CELERY_BIN} multi restart worker \
          -A ${CELERY_APP} --pidfile=${CELERYD_PID_FILE} \
          --logfile=${CELERYD_LOG_FILE} --loglevel=INFO --time-limit=300 --concurrency=2'

      [Install]
      WantedBy=multi-user.target

  "/etc/tmpfiles.d/celery.conf":
    mode: "000755"
    owner: celery
    group: celery
    content: |
      d /var/run/celery 0755 celery celery -
      d /var/log/celery 0755 celery celery -

container_commands:
  01_create_celery_log_file_directories:
    command: mkdir -p /var/log/celery /var/run/celery
  02_give_celery_user_ownership_of_directories:
    command: chown -R celery:celery /var/log/celery /var/run/celery
  03_change_mode_of_celery_directories:
    command: chmod -R 755 /var/log/celery /var/run/celery
  04_reload_settings:
    command: systemctl daemon-reload

# In .platform/hooks/postdeploy/01_start_celery.sh

#!/bin/bash
(cd /var/app/current; systemctl stop celery)
(cd /var/app/current; systemctl start celery)
(cd /var/app/current; systemctl enable celery.service)
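One gotcha worth flagging here, since it comes up later in this thread: .platform hook scripts have to be executable, otherwise the deploy fails with a permission-denied error. If you commit from Windows, one way to set the executable bit in git itself is:
git update-index --chmod=+x .platform/hooks/postdeploy/01_start_celery.sh
Or just chmod +x the file on Linux/macOS before committing.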
I was going by these issues:
celery/celery#6304
celery/celery#6285
To see if Celery is running, first you need to activate the virtualenv on your instance after ssh'ing in:
cd /var/app/current && . /var/app/venv/staging-LQM1lest/bin/activate
And then you can just do: celery inspect active
If you put your log files in the same place as the script, you can check all the log files in that folder by doing: tail /var/log/celery/*.log
edit: I think you also need to export the environment variable DJANGO_SETTINGS_MODULE to get this to work.
You mean export DJANGO_SETTINGS_MODULE in option_settings, like this?
option_settings:
  "aws:elasticbeanstalk:application:environment":
    DJANGO_SETTINGS_MODULE: "config.settings.production"
I downgraded to 4.4.6 and no longer get the "Cannot connect to amqp://guest:**@127.0.0.1:5672// ...." error, but the service is stuck:
sept. 18 17:03:42 ip-172-31-17-176.eu-west-3.compute.internal systemd[1]: Failed to start Celery Service.
sept. 18 17:03:42 ip-172-31-17-176.eu-west-3.compute.internal systemd[1]: Unit celery.service entered failed state.
sept. 18 17:03:42 ip-172-31-17-176.eu-west-3.compute.internal systemd[1]: celery.service failed.
sept. 18 17:03:42 ip-172-31-17-176.eu-west-3.compute.internal systemd[1]: Starting Celery Service...
And I don't know why; there are no logs... :(
So you do need to export your settings like that, but you also need that variable in your environment when you ssh into the box. If that's all you have, then typing echo $DJANGO_SETTINGS_MODULE in your shell won't print anything, which is what you need to change. If you want to fix that as a one-off, you can type:
export DJANGO_SETTINGS_MODULE=config.settings.production
Alternatively, if you want (or need) to export all of your environment variables so they are accessible when you ssh into the box, you can do that by creating a script .platform/hooks/postdeploy/01_set_env.sh that contains:
#!/bin/bash

# Create a copy of the environment variable file
cp /opt/elasticbeanstalk/deployment/env /opt/elasticbeanstalk/deployment/custom_env_var

# Set permissions on the custom_env_var file so it can be accessed by any user on the
# instance. You can restrict permissions as per your requirements.
chmod 644 /opt/elasticbeanstalk/deployment/custom_env_var

# Remove duplicate files upon deployment
rm -f /opt/elasticbeanstalk/deployment/*.bak

# Turn `ENVVAR=aoeu` into `export ENVVAR='aoeu'`
# Will not work if environment variables contain "=" or "'" characters
/bin/cat <<EOM >/opt/elasticbeanstalk/deployment/quote_env_vars.py
new_lines = []
with open('/opt/elasticbeanstalk/deployment/custom_env_var') as f:
    for line in f:
        line = line.strip('\n').replace('=', "='")
        new_line = "export " + line + "'"
        new_lines.append(new_line)
with open('/opt/elasticbeanstalk/deployment/custom_env_var', 'w') as f:
    for new_line in new_lines:
        print(new_line, file=f)
EOM
python3 /opt/elasticbeanstalk/deployment/quote_env_vars.py
Then when you ssh into the box, you can export all of your environment variables into your current shell by doing:
source /opt/elasticbeanstalk/deployment/custom_env_var
Note that for security reasons you don't really want all your secret keys sitting in files on your box, so you might want to delete this hook when you're done debugging the issue.
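As an aside, a sed one-liner can do roughly the same quoting as the Python helper above, with the same caveat about values containing "=" or "'" characters; just a sketch, not tested against every edge case:
sed -i "s/=\(.*\)/='\1'/; s/^/export /" /opt/elasticbeanstalk/deployment/custom_env_var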
Not sure, do you have the .platform/hooks/postdeploy/01_start_celery.sh file?
You can also check systemd's logs using journalctl -xe and see if there is anything related to celery. Also make sure you have enough memory; I found it to be unstable with less than 2 gigs.
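A couple of other journalctl invocations that are handy here, assuming the unit is named celery.service as above:
journalctl -u celery.service --no-pager -n 100   # last 100 log lines for the unit
journalctl -u celery.service -f                  # follow the unit's logs live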
Also use netcat or whatever to make sure you can connect to whatever queue you are using. E.g., if you are using Redis on ElastiCache, you need to make sure the security group is configured correctly to allow traffic from your EC2 boxes on the right port (6379 for Redis). You can do:
nc -zv your-redis-url.amazonaws.com 6379
But you need to run yum install nc first to get that to work.
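If it is Redis, and redis-cli happens to be installed on the instance, that gives a more direct check than netcat (hypothetical hostname again):
redis-cli -h your-redis-url.amazonaws.com -p 6379 ping   # should print PONG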
OK, my bad. I was still in Type=simple mode (due to tests). Now it works well (with Type=forking).
Thanks again for your great support ;)
EDIT:
There was another error in my config file. I was using the value of CELERYD_LOG_FILE directly in the script.
Failing script:
ExecStart=/bin/sh -c '/var/app/venv/*/bin/celery multi start worker \
    -A intramuros.taskapp.celery:app --pidfile=/var/run/celery/worker.pid \
    --logfile="/var/log/celery/%n%I.log" --loglevel=INFO --time-limit=300'
Working script:
ExecStart=/bin/sh -c '/var/app/venv/*/bin/celery multi start worker \
    -A intramuros.taskapp.celery:app --pidfile=/var/run/celery/worker.pid \
    --logfile=/var/log/celery/worker.log --loglevel=INFO --time-limit=300'
The value %n%I seems to be the root cause.
@appli-intramuros Oh yeah, IIRC you need to use %%n%%I if you're writing the value directly in the unit file. But if you're passing the values in as a variable then I think it should work without the double percent signs. Something to do with percent signs needing to be escaped in certain contexts; I'm not entirely sure what the general rule is there.
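For what it's worth, the escaping rule as far as I understand it: inside a systemd unit file, % introduces a specifier, so a literal percent sign has to be written as %%. A sketch of what the worker unit above would look like with the log pattern written inline rather than passed through an environment variable:
ExecStart=/bin/sh -c '${CELERY_BIN} multi start worker \
    -A ${CELERY_APP} --pidfile=${CELERYD_PID_FILE} \
    --logfile=/var/log/celery/%%n%%I.log --loglevel=INFO'
systemd rewrites %%n%%I to the literal %n%I before running the command, and celery multi then expands %n and %I itself. Values that arrive via EnvironmentFile= skip systemd's specifier expansion, which would explain why the unescaped form works there.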
"command: useradd -d /opt/python/celery -g celery -u 1501 celery"
Is this correct? no matter what i try i keep getting the error "ToolError: celery is not a valid user name"
Using 01_python.config:
container_commands:
  00_add_user_celery:
    test: test ! "id -u celery 2> /dev/null"
    command: useradd -d /opt/python/celery -g celery -u 1501 celery
    ignoreErrors: false
  01_migrate:
    command: "source /var/app/venv/*/bin/activate && python3 manage.py migrate"
    leader_only: true
  02_wsgipass:
    command: 'echo "WSGIPassAuthorization On" >> ../wsgi.conf'
and 02_celery.config is exactly the same as what you originally posted.
@typefox09 00_add_user_celery needs to go under commands:, not container_commands:. They're different: commands run as root before the application bundle is extracted, while container_commands run from the staged application directory afterwards. (But your 01_migrate and 02_wsgipass should go under container_commands:.)
This is the other error: "useradd: cannot create directory /opt/python/celery", hence why I asked if /opt/python/celery is correct, as when I ssh into the shell I can't find a python folder in /opt.
New 01_python.config:
groups:
  celery:
    gid: "101"
commands:
  00_add_user_celery:
    test: test ! "id -u celery 2> /dev/null"
    command: useradd -d /opt/python/celery -g celery -u 1502 celery
    ignoreErrors: false
container_commands:
  01_migrate:
    command: "source /var/app/venv/*/bin/activate && python3 manage.py migrate"
    leader_only: true
  02_wsgipass:
    command: 'echo "WSGIPassAuthorization On" >> ../wsgi.conf'
Why is deploying Django on Elastic Beanstalk so hard? I didn't have any issue with Heroku until I moved to AWS. Please, where do I get to learn all these commands?
@bgreatfit I mean, almost no one really knows these commands offhand, and they're probably not worth learning. If you understand conceptually what steps you need to do, then you can Google the syntax; otherwise you can wait until someone else posts something that works for them, then try it yourself and just work through whatever errors happen. For me it was a mix of both. I wouldn't have been successful at getting Django deployed had I tried even a couple months earlier, because not enough people had done it yet and posted how they did it. But I also had to Google a ton of stuff and spend a few days working through errors. I think it took about 5 to 7 days in total for me to go from running Django on Amazon Linux with Apache and mod_wsgi to running Django on Amazon Linux 2 with nginx and gunicorn. I agree that's more than it should take, but at the same time I wouldn't recommend spending a bunch of time learning random Linux commands; that's probably not a good use of time unless you're studying to be a sysadmin or something.
container_commands:
  01_migrate:
    command: "source /var/app/venv/*/bin/activate && python3 manage.py migrate"
    leader_only: true
  02_wsgipass:
    command: 'echo "WSGIPassAuthorization On" >> ../wsgi.conf'
  03_permissions:
    command: "chmod +x .platform/hooks/predeploy/01_pycurl_update.sh"

In .platform/hooks/predeploy/01_pycurl_update.sh:
#!/bin/bash
pip install pycurl==7.43.0.6
Every time I run this I get the error:
[ERROR] An error occurred during execution of command [app-deploy] - [RunAppDeployPreDeployHooks]. Stop running the command. Error: Command .platform/hooks/predeploy/01_pycurl_update.sh failed with error fork/exec .platform/hooks/predeploy/01_pycurl_update.sh: no such file or directory
If I remove the chmod +x then I get permission denied. I am using Windows, but have even tried writing the .sh file in WSL, committing it, and then redownloading it on Windows to then eb deploy. Still won't work. Seems a lot of people are having this problem but there is no clear solution. I have even ssh'ed into the EC2 instance, cd'ed to the project, and run the script from there, and it works fine???
This issue occurs with every script I put in the hooks folder, including your 01_start_celery.sh; this pycurl update file is just a test to get it to work, as it's a simple script with little text.
Many thanks! Was really useful! Some updates from that point:
- It's very important that the *.sh scripts used in hooks have LF line endings instead of CRLF. To convert one, run (after replacing <file_name.sh> with your filename):
sed -i 's/\r$//' <file_name.sh>
- Another important thing is to make sure git uses LF. On Windows, core.autocrlf = true means LF gets converted to CRLF, so to turn it off run (see also the .gitattributes sketch after this list):
git config --global core.autocrlf false
- For the ExecStart command, based on the latest Celery docs, I highly recommend using dynamic options: change multi start worker to multi start ${CELERYD_NODES}, and set a CELERYD_NODES attribute (e.g. worker1 worker2 ...) in the EB environment config. Much easier to control.
- For celery beat I am not quite sure how to get it working. I already followed https://docs.celeryproject.org/en/latest/userguide/daemonizing.html: deployment works without any problem and celery beat starts, but it then fails with "Permission denied: 'celerybeat-schedule'". In case you already have a solution for it, please let me know.

Versions that I am using:
Django 3.2
Celery 5.0.5
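On the line-endings point: a narrower alternative to flipping core.autocrlf globally is pinning the line endings for shell scripts in a .gitattributes file at the repo root; a minimal sketch, which can also be scoped to just the .platform/ directory if you prefer:
*.sh text eol=lf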
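On the "Permission denied: 'celerybeat-schedule'" error: a plausible cause, going by the working beat unit posted later in this thread, is that celery beat writes its schedule file into the working directory (/var/app/current here), which the celery user can't write to. Pointing the schedule file somewhere writable with -s should avoid it, e.g. (a sketch; substitute your own app module):
celery beat -A yourproject.celery -s /tmp/celerybeat-schedule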
Does this solution scale on EB? I know multiple Beats and the threat of leader termination make this a frustrating problem on EB without separating Celery into an entirely different environment.
@ScottHull Probably not, but I'm not sure. Lots of people have asked questions along these lines on GitHub and Stack Overflow, and no one has ever replied to any of these threads with a clear answer.
@Alex3917 Yeah, it's very frustrating. I managed to get Django, Celery Beat, and a Celery Worker running on a multi-container Docker environment on EB instead of relying on .ebextensions. Going to try to set up Beat/Worker in an EB Worker Environment and remove it from my Django environment completely.
This is very helpful, thank you for taking the time to post.
I am having issues as well. When I SSH into my instance I can start celery and things seem to work fine. However, I am not able to have it start on deployment.
As a side note, I also want to figure out the best way to set this up properly. I do not completely understand whether I'm running this in the background, or how I can separate celery from the instance entirely so as to avoid slow response times from my application.
@ScottHull do you have a good reference for how you set yours up that you can share? Thanks in advance.
@Alex3917 @ToddRKingston I just wrote a piece about this because it was time for a definitive solution to be published by somebody. I'm using Docker and splitting off Celery into an entirely different EB instance, and then using RDS as my database and using the database as the task scheduler. Also using SQS as a queue. https://standarddev.substack.com/p/part-3-adding-scalability-django
@ScottHull Thank you for this. I'm currently set up the way Alex3917 suggested above, as I do not have experience with Docker.
@Alex3917 Many thanks for your help on this; I have set this up and it is running. One question: if I want to start beat, do you mind helping with where those commands would go? If I'm not mistaken, I just need a command to start beat (similar to starting celery)?
Hi @ToddRKingston, did you get celery beat working? I'm trying to accomplish the same thing.
Thanks!
Hi @Alex3917, thank you so much for this post.
What do I need to do in order to run celery beat as well? I have a scheduled task that runs every midnight.
@edchelstephens I actually haven't gotten celery beat up and running yet (no need yet), but when I eventually need it I can update this.
Also thanks @ScottHull for the blog post, that's super useful! My personal hope is to be able to just keep using Python beanstalk without Docker, but given that the images aren't really getting much love recently, it's great to have that in case it's needed.
Hi @Alex3917, thank you for the quick response! Much appreciated!
Even if you don't need it yet, can you update this post with instructions on how to configure celery beat as well?
Really badly need it.
Thanks a lot!
Hi @Alex3917, I followed the instructions and manually exported the Django settings module to the environment when I ssh in.
But when I run celery inspect active, I get the following error:
Traceback (most recent call last):
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/kombu/connection.py", line 446, in _reraise_as_library_errors
yield
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/kombu/connection.py", line 433, in _ensure_connection
return retry_over_time(
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/kombu/utils/functional.py", line 312, in retry_over_time
return fun(*args, **kwargs)
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/kombu/connection.py", line 877, in _connection_factory
self._connection = self._establish_connection()
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/kombu/connection.py", line 812, in _establish_connection
conn = self.transport.establish_connection()
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/kombu/transport/pyamqp.py", line 201, in establish_connection
conn.connect()
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/amqp/connection.py", line 323, in connect
self.transport.connect()
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/amqp/transport.py", line 129, in connect
self._connect(self.host, self.port, self.connect_timeout)
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/amqp/transport.py", line 184, in _connect
self.sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/var/app/venv/staging-LQM1lest/bin/celery", line 8, in <module>
sys.exit(main())
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/celery/__main__.py", line 15, in main
sys.exit(_main())
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/celery/bin/celery.py", line 217, in main
return celery(auto_envvar_prefix="CELERY")
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/celery/bin/base.py", line 134, in caller
return f(ctx, *args, **kwargs)
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/celery/bin/control.py", line 136, in inspect
replies = inspect._request(action,
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/celery/app/control.py", line 106, in _request
return self._prepare(self.app.control.broadcast(
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/celery/app/control.py", line 741, in broadcast
return self.mailbox(conn)._broadcast(
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/kombu/pidbox.py", line 328, in _broadcast
chan = channel or self.connection.default_channel
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/kombu/connection.py", line 895, in default_channel
self._ensure_connection(**conn_opts)
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/kombu/connection.py", line 433, in _ensure_connection
return retry_over_time(
File "/usr/lib64/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/var/app/venv/staging-LQM1lest/lib/python3.8/site-packages/kombu/connection.py", line 450, in _reraise_as_library_errors
raise ConnectionError(str(exc)) from exc
kombu.exceptions.OperationalError: [Errno 111] Connection refused
Do you know how I can fix this?
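For context on that traceback: amqp://guest@127.0.0.1:5672 is Celery's built-in default broker, so a refused connection there usually means the configured broker URL never reached the process, i.e. the relevant environment variables aren't set in that shell. A quick sanity check, assuming the broker URL lives in a Django setting named CELERY_BROKER_URL (the setting name varies by project):
echo $DJANGO_SETTINGS_MODULE
python3 -c "from django.conf import settings; print(settings.CELERY_BROKER_URL)"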
Hi @sorin-sabo, were you able to get this working? Can you share your configuration?
@edchelstephens No, I have not gotten beat to start on deployment. I can start it manually after SSHing into the instance.
@Alex3917 Sorry to bother you on this, but I tried to add a file to start beat after starting celery, and after restarting the instance I'm somehow getting the same error as typefox09: 'celery is not a valid user name'. Should I add this to my 01_celery.config file?
commands:
  00_add_user_celery:
    test: test ! "id -u celery 2> /dev/null"
    command: useradd -d /opt/python/celery -g celery -u 1502 celery
    ignoreErrors: false
Is this a permissions issue somehow? Any help you can provide would be very much appreciated.
@Alex3917 I've added a 01_python.config file like you mentioned in your first response to appli-intramuros; however, should the path change from /opt/python/celery to something else on AL2?
I'm getting 'Created group celery successfully' but then a 'Command 00_add_user_celery (useradd -d /opt/python/celery -g celery -u 1501 celery) failed' error.
When I SSH in and try to run the command 'useradd -d /opt/python/celery -g celery -u 1501 celery' I get '-bash: /usr/sbin/useradd: Permission denied'.
Thanks in advance.
@edchelstephens I was able to get beat to start on deployment. I added another file in .ebextensions/01_celery.config, similar to "/etc/systemd/system/celery.service": but instead called "/etc/systemd/system/celerybeat.service". This is the file (underneath the celery.service file):
"/etc/systemd/system/celerybeat.service":
mode: "000644"
owner: celery
group: celery
content: |
[Unit]
Description=Celery Beat Service
After=network.target
[Service]
Type=forking
Restart=on-failure
RestartSec=10
User=celery
Group=celery
EnvironmentFile=/opt/elasticbeanstalk/deployment/env
WorkingDirectory=/var/app/current
ExecStart=/bin/sh -c '/var/app/venv/staging-LQM1lest/bin/celery beat -A video.celery \
--pidfile=/tmp/celerybeat.pid \
--logfile=/var/log/celery/celerybeat.log \
--loglevel=INFO -s /tmp/celerybeat-schedule'
[Install]
WantedBy=multi-user.target
After that, I added:
(cd /var/app/current; systemctl stop celerybeat)
(cd /var/app/current; systemctl start celerybeat)
(cd /var/app/current; systemctl enable celerybeat.service)
to the 01_start_celery.sh file that had the same language for celery. Hope this helps.
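To sanity-check that beat actually came up after a deploy, assuming the unit name and log path above:
systemctl status celerybeat
tail /var/log/celery/celerybeat.log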
Take a look at this gist. I have successfully deployed a Django REST app with Celery and AWS SQS on Elastic Beanstalk with Amazon Linux 2:
https://gist.github.com/MahirMahbub/f9c226dbc0a01da22c8c539cf9c9bcc9
No errors, but nothing seems to run... the logs directory is empty... How can I check that the service is running? Maybe it's my version of Celery (4.4.7); I will try downgrading to see if something happens.
How do you know that Celery 4.4.7 is not working with that daemonization?