@beccasaurus
Last active June 13, 2019 23:57
$ include_samples=True (polyfill to support synth + samples with tests)

include_samples=True

SynthTool PR: Add include_samples=True #263


Polyfill to support generating samples with tests w/ SynthTool

This feature is currently in development – this script lets you try it out today!

See also: gapicify-samples polyfill for using the latest sample format

Try this out!

This isn't really meant for everyday local use; it is for quickly prototyping SynthTool behavior. You are welcome to use it anyway!

  1. Go into the folder of any client library (which uses SynthTool)
    git clone https://github.com/googleapis/google-cloud-python.git
    cd google-cloud-python/
    cd speech/
    
  2. Edit synth.py and update it so it generates samples (only needs to be done once per library)
    # update this:
    # library = gapic.py_library("speech", version, include_protos=True)
    # to this: (sample generation is currently behind this feature flag)
    library = gapic.py_library("speech", version, include_protos=True, generator_args=['--dev_samples'])
    
    # add a directive to copy or move the samples directory
    s.move(library / f"samples/{version}")
  3. (Optional) To generate from a local googleapis checkout, point SynthTool at that directory

    You can skip this step if you plan to generate from production googleapis/googleapis

    git clone https://github.com/googleapis/googleapis.git
    cd googleapis/
    export SYNTHTOOL_GOOGLEAPIS=`pwd`
    
  4. Run synth.py
    python3 -m synthtool
    
  5. If the speech_gapic.yaml file for this API contains configured code samples, those should be visible in samples/
    tree samples/
    
     samples/
     ├── v1
     │   ├── speech_transcribe_async.py
     │   ├── speech_transcribe_async_gcs.py
     │   ├── speech_transcribe_async_word_time_offsets_gcs.py
     │   ├── speech_transcribe_enhanced_model.py
     │   ├── speech_transcribe_model_selection.py
     │   ├── speech_transcribe_model_selection_gcs.py
     │   ├── speech_transcribe_multichannel.py
     │   ├── speech_transcribe_multichannel_gcs.py
     │   ├── speech_transcribe_sync.py
     │   └── speech_transcribe_sync_gcs.py
     └── v1p1beta1
        ├── speech_transcribe_auto_punctuation_beta.py
        ├── speech_transcribe_diarization_beta.py
        ├── speech_transcribe_multilanguage_beta.py
        ├── speech_transcribe_recognition_metadata_beta.py
        └── speech_transcribe_word_level_confidence_beta.py
    
  6. Next, download this script. It pulls in the required sample resources and tests and configures the tests so they can be run
    curl -LO https://gist.github.com/beccasaurus/8ac942988a8f6021a6bf938eb0b6858b/raw/include_samples.sh
    chmod +x include_samples.sh
    
  7. Edit it. There are a few variables to change at the top of the file.
    ## 
    # Configure these variables:
    ##
    API_NAME=speech
    VERSIONS="v1 v1p1beta1"
    LANGUAGE=python
    
    # Or run with arguments ./include_samples.sh [api name] "[versions]" [language]
  8. Run it.
    ./include_samples.sh
    
  9. If googleapis has *.test.yaml test files and/or a sample_resources.yaml file, you should have this in samples/
    tree samples/
    
     samples/
     ├── resources
     │   ├── brooklyn_bridge.flac
     │   ├── brooklyn_bridge.raw
     │   ├── brooklyn_bridge.wav
     │   ├── commercial_mono.wav
     │   ├── hello.raw
     │   ├── hello.wav
     │   ├── multi.flac
     │   └── multi.wav
     ├── v1
     │   ├── speech_transcribe_async.py
     │   ├── speech_transcribe_async_gcs.py
     │   ├── speech_transcribe_async_word_time_offsets_gcs.py
     │   ├── speech_transcribe_enhanced_model.py
     │   ├── speech_transcribe_model_selection.py
     │   ├── speech_transcribe_model_selection_gcs.py
     │   ├── speech_transcribe_multichannel.py
     │   ├── speech_transcribe_multichannel_gcs.py
     │   ├── speech_transcribe_sync.py
     │   ├── speech_transcribe_sync_gcs.py
     │   └── test
     │       ├── samples.manifest.yaml
     │       ├── speech_transcribe_async.test.yaml
     │       ├── speech_transcribe_async_gcs.test.yaml
     │       ├── speech_transcribe_async_word_time_offsets_gcs.test.yaml
     │       ├── speech_transcribe_enhanced_model.test.yaml
     │       ├── speech_transcribe_model_selection.test.yaml
     │       ├── speech_transcribe_model_selection_gcs.test.yaml
     │       ├── speech_transcribe_multichannel.test.yaml
     │       ├── speech_transcribe_multichannel_gcs.test.yaml
     │       ├── speech_transcribe_sync.test.yaml
     │       └── speech_transcribe_sync_gcs.test.yaml
     └── v1p1beta1
         ├── speech_transcribe_auto_punctuation_beta.py
         ├── speech_transcribe_diarization_beta.py
         ├── speech_transcribe_multilanguage_beta.py
         ├── speech_transcribe_recognition_metadata_beta.py
         ├── speech_transcribe_word_level_confidence_beta.py
         └── test
             ├── samples.manifest.yaml
             ├── speech_transcribe_auto_punctuation_beta.test.yaml
             ├── speech_transcribe_diarization_beta.test.yaml
             ├── speech_transcribe_multilanguage_beta.test.yaml
             ├── speech_transcribe_recognition_metadata_beta.test.yaml
             └── speech_transcribe_word_level_confidence_beta.test.yaml
    
    
  10. Install sample-tester for running generated code sample tests

    Install from this fork/branch

    python3 -m pip install sample-tester
    
  11. Run the tests for v1 and v1p1beta1
    sample-tester samples/*/test/*

    # or a specific version
    sample-tester samples/v1/test/*

    # or a specific test
    sample-tester --cases "[test name]" samples/v1/test/*

    # print out how each sample is run + its output
    sample-tester -v detailed samples/v1/test/*

If all went well, you should have a bunch of passing tests :)

 RUNNING: Test environment: "python"
 RUNNING: Test suite: "Transcribe Audio File using Long Running Operation (Local File) (LRO)"
   PASSED: Test case: "speech_transcribe_async (no arguments)"
   PASSED: Test case: "speech_transcribe_async (--local_file_path)"
 RUNNING: Test suite: "Transcript Audio File using Long Running Operation (Cloud Storage) (LRO)"
   PASSED: Test case: "speech_transcribe_async_gcs (no arguments)"
   PASSED: Test case: "speech_transcribe_async_gcs (--storage_uri)"
 RUNNING: Test suite: "Getting word timestamps (Cloud Storage) (LRO)"
   PASSED: Test case: "speech_transcribe_async_word_time_offsets_gcs (no arguments)"
   PASSED: Test case: "speech_transcribe_async_word_time_offsets_gcs (--storage_uri)"
 RUNNING: Test suite: "Using Enhanced Models (Local File)"
   PASSED: Test case: "speech_transcribe_enhanced_model (no arguments)"
   PASSED: Test case: "speech_transcribe_enhanced_model (--local_file_path)"
 RUNNING: Test suite: "Selecting a Transcription Model (Local File)"
   PASSED: Test case: "speech_transcribe_model_selection (no arguments)"
   PASSED: Test case: "speech_transcribe_model_selection (--local_file_path)"
   PASSED: Test case: "speech_transcribe_model_selection (--model)"
   PASSED: Test case: "speech_transcribe_model_selection (invalid --model)"
 RUNNING: Test suite: "Selecting a Transcription Model (Cloud Storage)"
   PASSED: Test case: "speech_transcribe_model_selection_gcs (no arguments)"
   PASSED: Test case: "speech_transcribe_model_selection_gcs (--local_file_path)"
   PASSED: Test case: "speech_transcribe_model_selection_gcs (--model)"
   PASSED: Test case: "speech_transcribe_model_selection_gcs (invalid --model)"
 RUNNING: Test suite: "Multi-Channel Audio Transcription (Local File)"
   PASSED: Test case: "speech_transcribe_multichannel (no arguments)"
   PASSED: Test case: "speech_transcribe_multichannel (--local_file_path)"
 RUNNING: Test suite: "Multi-Channel Audio Transcription (Cloud Storage)"
   PASSED: Test case: "speech_transcribe_multichannel_gcs (no arguments)"
   PASSED: Test case: "speech_transcribe_multichannel_gcs (--storage_uri)"
 RUNNING: Test suite: "Transcribe Audio File (Local File)"
   PASSED: Test case: "speech_transcribe_sync (no arguments)"
   PASSED: Test case: "speech_transcribe_sync (--local_file_path)"
 RUNNING: Test suite: "Transcript Audio File (Cloud Storage)"
   PASSED: Test case: "speech_transcribe_sync_gcs (no arguments)"
   PASSED: Test case: "speech_transcribe_sync_gcs (--storage_uri)"
 RUNNING: Test suite: "Getting punctuation in results (Local File) (Beta)"
   PASSED: Test case: "speech_transcribe_auto_punctuation_beta (no arguments)"
   PASSED: Test case: "speech_transcribe_auto_punctuation_beta (--local_file_path)"
 RUNNING: Test suite: "Separating different speakers (Local File) (LRO) (Beta)"
   PASSED: Test case: "speech_transcribe_diarization_beta (no arguments)"
   PASSED: Test case: "speech_transcribe_diarization_beta (--local_file_path)"
 RUNNING: Test suite: "Detecting language spoken automatically (Local File) (Beta)"
   PASSED: Test case: "speech_transcribe_multilanguage_beta (no arguments)"
   PASSED: Test case: "speech_transcribe_multilanguage_beta (--local_file_path)"
 RUNNING: Test suite: "Adding recognition metadata (Local File) (Beta)"
   PASSED: Test case: "speech_transcribe_recognition_metadata_beta (no arguments)"
   PASSED: Test case: "speech_transcribe_recognition_metadata_beta (--local_file_path)"
 RUNNING: Test suite: "Enabling word-level confidence (Local File) (Beta)"
   PASSED: Test case: "speech_transcribe_word_level_confidence_beta (no arguments)"
   PASSED: Test case: "speech_transcribe_word_level_confidence_beta (--local_file_path)"

Tests passed
  12. Made updates? Did the samples or tests in googleapis change? Run everything again. You can even delete samples/ first if you want.
    rm -r samples/ && python3 -m synthtool && ./include_samples.sh && sample-tester samples/*/test/*
    

What does include_samples=True do?

When you add include_samples=True to your synth.py, running synthtool should perform the following:

  1. Run artman with the --dev_samples flag

    -> You can do this today via py_library(..., generator_args=['--dev_samples'])

  2. Copy sample tests from googleapis

    -> Based on the current implementation of include_protos=True, the *.test.yaml
    -> files from googleapis/googleapis[-private] should be copied over, probably
    -> into the samples/ directory which artman outputs for each library

  3. Generate code sample per-language manifest file

    -> This relies on the sample-tester PyPI package being available
    -> The generated file is an index of every code sample that was generated,
    -> keyed by region tag. It defines how to execute each code sample,
    -> e.g. via python3 samples/v1/sample_name.py or mvn:exec -Dsample_name
    -> The test executor requires this so that it can run each sample the way a user would

  4. Copy sample resources from Cloud Storage

    -> Read the sample_resources.yaml file in the API's root directory in googleapis.
    -> It lists every required resource file, with a public GCS URI for download and a description of each file.
    -> Download these into samples/resources/ for use by samples and tests.
    -> May also support copying resources/ files directly from googleapis
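
Step 4 can be sketched in Python as follows. This is only an illustration of the idea, assuming the same line-based scan the polyfill script uses (`grep uri ... | awk '{ print $NF }'`) rather than a real YAML parse; the function names are hypothetical, and the exact layout of sample_resources.yaml is not specified here, so any line containing "uri" is treated as a resource entry.

```python
from pathlib import Path
from posixpath import basename


def resource_uris(sample_resources_yaml_text):
    """Collect the last whitespace-separated field of every line containing 'uri'.

    Hypothetical helper mirroring the script's grep/awk pipeline; it does not
    parse YAML, so an unrelated line that mentions 'uri' would also match.
    """
    uris = []
    for line in sample_resources_yaml_text.splitlines():
        fields = line.split()
        if "uri" in line and fields:
            uris.append(fields[-1])
    return uris


def resources_to_download(uris, destination_directory="samples/resources"):
    """Return the URIs whose file name is not yet present in the destination."""
    destination = Path(destination_directory)
    return [uri for uri in uris if not (destination / basename(uri)).is_file()]
```

Each URI returned by `resources_to_download` would then be fetched into samples/resources/, for example with `gsutil cp` as the script does. Skipping files that already exist keeps repeated runs cheap.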

#! /bin/bash
##
# Configure these variables:
##
# API_NAME=speech
# VERSIONS="v1 v1p1beta1"
# LANGUAGE=python
# Or run with arguments ./include_samples.sh [api name] "[versions]" [language]

main() {
  if [ "$#" -eq 3 ]
  then
    include_samples "$@"
  elif [ "$#" -eq 0 ] && [ -n "$API_NAME" ] && [ -n "$VERSIONS" ] && [ -n "$LANGUAGE" ]
  then
    include_samples "$API_NAME" "$VERSIONS" "$LANGUAGE"
  else
    usage
    exit 1
  fi
}

usage() {
  echo "Usage: ./include_samples.sh [api name] \"[version] [version]\" [language]"
  echo
  echo "You can also edit include_samples.sh to set configuration variables"
  echo
  echo "Or you can provide environment variables, e.g."
  echo " export API_NAME=speech"
  echo ' export VERSIONS="v1 v1p1beta1"'
  echo " export LANGUAGE=python"
}

include_samples() {
  local api_name="$1"
  local versions="$2"
  local language="$3"
  # $versions is intentionally unquoted: it is a space-separated list
  for version in $versions
  do
    # copy example files used by samples and tests into samples/resources/
    copy_sample_resources "$api_name" "$version"
    # copy test files (*.test.yaml) into samples/[version]/test/
    copy_test_files "$api_name" "$version"
    # generate the yaml file which declares how to invoke each sample
    # (used by the test runner to invoke samples in the tests)
    generate_sample_exec_manifest "$api_name" "$version" "$language"
  done
}

copy_sample_resources() {
  local api_name="$1"
  local version="$2"
  local destination_directory=samples/resources
  # resource files, if any, are declared in sample_resources.yaml in googleapis
  # they are shared across all versions of the API for reusability
  local sample_resources_yaml="$(googleapis_api_root "$api_name")/sample_resources.yaml"
  if [ -f "$sample_resources_yaml" ]
  then
    local gcs_resource_uris="$( grep uri "$sample_resources_yaml" | awk '{ print $NF }' )"
    mkdir -p "$destination_directory"
    for uri in $gcs_resource_uris
    do
      local remote_filename="$( basename "$uri" )"
      if [ ! -f "$destination_directory/$remote_filename" ]
      then
        gsutil cp "$uri" "$destination_directory"
      fi
    done
  fi
}

copy_test_files() {
  local api_name="$1"
  local version="$2"
  local destination_directory="samples/$version/test/"
  local source_directory="$(googleapis_api_root "$api_name")/$version/samples"
  for test_yaml in $(find "$source_directory" -type f -name "*.test.yaml")
  do
    mkdir -p "$destination_directory"
    cp -v "$test_yaml" "$destination_directory"
  done
}

generate_sample_exec_manifest() {
  local api_name="$1"
  local version="$2"
  local language="$3"
  # When possible, run from samples/ because resources/ will be available there
  case "$language" in
    nodejs)
      sample-tester gen-manifest \
        --env nodejs \
        --bin node \
        --chdir samples/ \
        --output "samples/$version/test/samples.manifest.yaml" \
        "samples/$version"/*.js
      ;;
    php)
      sample-tester gen-manifest \
        --env php \
        --bin php \
        --chdir samples/ \
        --output "samples/$version/test/samples.manifest.yaml" \
        "samples/$version"/*.php
      ;;
    python)
      sample-tester gen-manifest \
        --env python \
        --bin python3 \
        --chdir samples/ \
        --output "samples/$version/test/samples.manifest.yaml" \
        "samples/$version"/*.py
      ;;
    ruby)
      sample-tester gen-manifest \
        --env ruby \
        --bin "bundle exec ruby" \
        --chdir samples/ \
        --output "samples/$version/test/samples.manifest.yaml" \
        "samples/$version"/*.rb
      ;;
    *)
      echo "Haven't added the sample-tester command for $language yet."
      ;;
  esac
}
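
# Locate the API directory by name anywhere under google/
# (e.g. google/cloud/speech), rather than assuming google/cloud/
googleapis_api_root() {
  local api_name="$1"
  find "$(googleapis_root)/google" -type d -name "$api_name"
}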

googleapis_root() {
  local googleapis_root=""
  if [ -n "$SYNTHTOOL_GOOGLEAPIS" ]; then
    googleapis_root="$SYNTHTOOL_GOOGLEAPIS"
    echo "Using googleapis @ $googleapis_root from SYNTHTOOL_GOOGLEAPIS" 1>&2
  else
    if grep private=True synth.py &>/dev/null; then
      googleapis_root="$HOME/.cache/synthtool/googleapis-private"
      echo "Using googleapis-private @ $googleapis_root" 1>&2
    else
      googleapis_root="$HOME/.cache/synthtool/googleapis"
      echo "Using googleapis @ $googleapis_root" 1>&2
    fi
  fi
  echo "$googleapis_root"
}

main "$@"
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.