@jehna
Last active September 11, 2021 04:36
App Engine import data from Datastore Backup to localhost
"""
# App Engine import data from Datastore Backup to localhost
You can use this script to import large(ish) App Engine Datastore backups to your localhost dev server.
## Getting backup files
Follow instructions from Greg Bayer's awesome article to fetch the App Engine backups:
http://gbayer.com/big-data/app-engine-datastore-how-to-efficiently-export-your-data/
Basically, download and configure gsutil and run:
```
gsutil -m cp -R gs://your_bucket_name/your_path /local_target
```
## Reading data to your local (dev_appserver) application
Copy-paste this gist to your Interactive Console, set correct paths and press `Execute`.
(default: http://localhost:8000/console)
"""
from os import listdir
from os.path import isfile, join

from google.appengine.api.files import records
from google.appengine.datastore import entity_pb
from google.appengine.ext import db
from google.net.proto.ProtocolBuffer import ProtocolBufferDecodeError


def run():
    # Set your downloaded folder's path here (must be readable by dev_appserver)
    mypath = '/local_target'
    # Set your app's name here
    appname = "dev~yourappnamehere"

    # Only regular files in the folder are treated as backup files
    onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

    for filename in onlyfiles:
        i = 0
        try:
            # Backup files hold binary records, so open them in binary mode
            with open(join(mypath, filename), 'rb') as raw:
                reader = records.RecordsReader(raw)
                to_put = []
                for record in reader:
                    entity_proto = entity_pb.EntityProto(contents=record)
                    # Point the entity's own key at the local app
                    entity_proto.key_.app_ = appname
                    entity = db.model_from_protobuf(entity_proto)

                    # Point every db.Key-valued property at the local app too
                    for name in dir(entity):
                        try:
                            value = getattr(entity, "_" + name)
                            if isinstance(value, db.Key):
                                value._Key__reference.set_app(appname)
                        except AttributeError:
                            pass  # It's okay: no such backing attribute

                    to_put.append(entity)
                    i += 1
                    # Write to the datastore in batches of 100 entities
                    if i % 100 == 0:
                        print("Saved %d %ss" % (i, entity.kind()))
                        db.put(to_put)
                        to_put = []

                # Write whatever is left from the last partial batch
                db.put(to_put)
                print("Saved %d" % i)
        except ProtocolBufferDecodeError:
            # All good: skip files whose records cannot be decoded as entities
            pass


run()
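
The script writes entities in batches of 100 and then flushes the remainder. As a rough sketch of that pattern on its own, here it is in plain Python with no App Engine imports. `chunked` and `fake_put` are hypothetical names introduced only for illustration; `fake_put` stands in for `db.put` and merely records batch sizes.

```python
def chunked(items, size=100):
    """Yield lists of at most `size` items from any iterable."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Flush the final, partially filled batch
        yield batch


# Hypothetical stand-in for db.put(); it only records how big each batch was.
saved_batch_sizes = []


def fake_put(batch):
    saved_batch_sizes.append(len(batch))


# 250 dummy "entities" are written as two full batches and one partial batch.
for batch in chunked(range(250), size=100):
    fake_put(batch)
```

Written this way, an empty input yields no batches at all, so the put function is never called with an empty list.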
@OJFord
OJFord commented Sep 15, 2014

Thanks for providing this! Let it be known that it also works without change for backups made to (and downloaded from) blobstore - easier and more convenient for very small apps.

@OJFord
OJFord commented Sep 19, 2014

This caused an error "devapp may not access sapp's data" for my db.ListProperty( db.Key )'s.

Solution below; see question on StackOverflow.

from google.appengine.ext.db import Key

for e in Model.all():
    if e.keyList:
        prod = e.keyList
        dev  = eval( str(prod).replace('s~','dev~').replace('datastore_types.','') )
        e.keyList = dev
        e.put()
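
The snippet above rewrites the app ID by string replacement on the whole key representation, so any kind name or string ID that happens to contain `s~` would be rewritten as well. A narrower approach is to rewrite only the app ID itself. The helper below is a sketch in plain Python; `to_dev_app_id` is a hypothetical name, not part of the App Engine SDK, and it assumes app IDs have the form `<partition>~<name>` as in the error message quoted above.

```python
def to_dev_app_id(app_id):
    """Map an app ID such as 's~myapp' to its 'dev~myapp' form."""
    prefix, sep, name = app_id.partition('~')
    # Assumption: an ID without '~' is a bare app name with no partition prefix.
    return 'dev~' + (name if sep else prefix)


print(to_dev_app_id('s~myapp'))
```

Inside the dev server, each key in the list could then probably be rebuilt with something like `db.Key.from_path(*key.to_path(), _app=to_dev_app_id(key.app()))`. That call is untested here and should be treated as an assumption to verify against your SDK version.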

@rotemvil1
rotemvil1 commented May 10, 2016

Hey, I just saw that "from google.appengine.api.files import records" is deprecated, and it isn't working for a backup made today.
Is there any update for that? The problem is that execution never enters the "for record in reader:" loop.
Thanks!

@masterpipo
Is there an update to this script for use with ndb?