Shared modules allow data scientists to implement complex functionality as virtual functions, classes, installable scripts, or packaged apps.
Consider a use case where we want to leverage sqlalchemy's ORM models.

Sign up to MongoDB Atlas at https://cloud.mongodb.com/
Create a new MongoDB cluster
Go to Connect > Connect your application > Python > 3.4
Copy/paste the first server URL, e.g.
""" omega-ml bulk deployment utility | |
(c) 2020 one2seven GmbH, Switzerland | |
Enables deployment of datasets, models, scripts, jobs as well as cloud | |
resources from a single configuration file. This is currently a separate | |
utility that will be integrated into the omega-ml cli. | |
Installation: | |
$ pip install -U getgist omegaml==0.14.0 | |
$ getgist omegaml omdeploy |
Dedicated clusters run omega-ml in a customer-owned cloud account (AWS, Azure, Exoscale, any other cloud provider, or on-premise). Each cluster is a fully deployed Kubernetes cluster with the following Rancher projects and namespaces:
def put_longname(store, obj, name, **kwargs):
    """ helper function to overcome MongoDB's limitation on namespace length

    Only use to store Pandas DataFrames and Series

    Usage:
        # copy/paste this function into your code base
        # instead of
        meta = om.datasets.put(obj, name)
        # use
        meta = put_longname(om.datasets, obj, name)
    """
omegaml plugin to chain runtime tasks

Usage:

    # this chains fit and predict, i.e. predict runs only if fit succeeds
    with om.runtime.chain() as crt:
        crt.model('regmodelx').fit('sample[y]', 'sample[x]')
        crt.model('regmodelx').predict([5], rName='foox')
        result = crt.run()
Motivating example: Say you have a file 'test.xyz' that you want to read and write in your web application. That's typically not a supported scenario in a containerized application, as local storage is ephemeral. For this purpose, om.datasets provides the python.file object kind:
This nbtasks plugin, built for the omega|ml runtime, lets you run a Jupyter notebook many times with a different set of parameters. Essentially it is Python's multiprocessing Pool.map() for running Jupyter notebooks in the cloud.
# run the 'mynb' notebook 10 times
# -- each notebook gets one value of the range
job = om.runtime.job('mynb')
job.map(range(10))
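The Pool.map() analogy can be made concrete in plain Python. This is only an illustration of the mapping semantics, not omega-ml code; run_notebook is a hypothetical stand-in for one parameterized notebook execution:

```python
from multiprocessing.dummy import Pool  # thread-based Pool with the same map() API

def run_notebook(param):
    # stand-in for one run of the 'mynb' notebook with a single parameter value
    return {'param': param, 'result': param * 2}

# like job.map(range(10)): one run per value, results collected in input order
with Pool(4) as pool:
    results = pool.map(run_notebook, range(10))

print([r['result'] for r in results])  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

As with Pool.map(), the runs are independent of each other, which is what makes it safe to execute them in parallel workers.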
""" | |
This fixes MDataFrame.merge() where the join results in duplicate object keys | |
When to use: | |
if your mdf.merge() operation results in a duplicate key error exception | |
Usage: | |
!pip install -q getgist | |
!rm -f *omx_qfmdfmerge.py && getgist -y omegaml omx_qfmdfmerge.py | |
import omx_qfmdfmerge |
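For background, a pandas illustration of how a join can surface clashing keys: when both frames carry a column with the same name, pandas disambiguates the result via suffixes, whereas an MDataFrame.merge() affected by this bug could end up emitting duplicate object keys instead. The frames and column names below are made up for illustration:

```python
import pandas as pd

left = pd.DataFrame({'id': [1, 2], 'value': [10, 20]})
right = pd.DataFrame({'id': [1, 2], 'value': [30, 40]})

# both frames have a 'value' column; pandas appends suffixes so the
# merged result never contains two columns with the same key
merged = left.merge(right, on='id', suffixes=('_l', '_r'))
print(list(merged.columns))  # ['id', 'value_l', 'value_r']
```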
def getinfo(om, models=None, datasets=None, scripts=None, jobs=None, outfile=None, tryget=False):
    """
    get info relevant to provide support

    Usage:
        import omegaml as om

        # to get a printed report
        print(getinfo(om, ...))