SynthTool PR: Add include_samples=True #263
Polyfill to support generating samples with tests w/ SynthTool
This feature is currently in development – this script lets you try it out today!
See also: gapicify-samples polyfill for using the latest sample format
This isn't really intended for long-term local use – it's a stopgap for prototyping SynthTool behavior – but you're welcome to try it!
- Go into the folder of any client library (which uses SynthTool)

```shell
git clone https://github.com/googleapis/google-cloud-python.git
cd google-cloud-python/
cd speech/
```
- Edit synth.py and update it so it generates samples (only needs to be done once per library)

```python
# Update this:
#     library = gapic.py_library("speech", version, include_protos=True)
# to this (sample generation is currently behind this feature flag):
library = gapic.py_library(
    "speech", version, include_protos=True, generator_args=["--dev_samples"]
)

# Add a directive to copy or move the samples directory
s.move(library / f"samples/{version}")
```
- (Optional) If you want to generate from a local googleapis checkout, configure the local googleapis directory

  You can skip this step if you plan to generate from production googleapis/googleapis.

```shell
git clone https://github.com/googleapis/googleapis.git
cd googleapis/
export SYNTHTOOL_GOOGLEAPIS=$(pwd)
```
- Run synth.py

```shell
python3 -m synthtool
```
- If the speech_gapic.yaml file for this API contains configured code samples, those should be visible in samples/

```
tree samples/
samples/
├── v1
│   ├── speech_transcribe_async.py
│   ├── speech_transcribe_async_gcs.py
│   ├── speech_transcribe_async_word_time_offsets_gcs.py
│   ├── speech_transcribe_enhanced_model.py
│   ├── speech_transcribe_model_selection.py
│   ├── speech_transcribe_model_selection_gcs.py
│   ├── speech_transcribe_multichannel.py
│   ├── speech_transcribe_multichannel_gcs.py
│   ├── speech_transcribe_sync.py
│   └── speech_transcribe_sync_gcs.py
└── v1p1beta1
    ├── speech_transcribe_auto_punctuation_beta.py
    ├── speech_transcribe_diarization_beta.py
    ├── speech_transcribe_multilanguage_beta.py
    ├── speech_transcribe_recognition_metadata_beta.py
    └── speech_transcribe_word_level_confidence_beta.py
```
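Each generated file is a standalone script a user can run directly. As a rough, hypothetical sketch of the shape of a generated sample (the real generated code calls the Speech API and will differ in its details):

```python
# Hypothetical sketch of a generated sample such as
# samples/v1/speech_transcribe_sync.py. Names and structure are
# illustrative only.
import argparse


def transcribe_file(local_file_path: str) -> str:
    # The real generated sample would call the Speech API here, e.g.
    # construct a SpeechClient and call recognize(). This sketch returns
    # a placeholder string so it stays runnable offline.
    return f"would transcribe {local_file_path}"


def main(argv=None):
    # Generated samples take their inputs as command-line flags, which is
    # what lets sample-tester invoke them the way a user would.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--local_file_path", default="resources/brooklyn_bridge.raw"
    )
    args = parser.parse_args(argv)
    print(transcribe_file(args.local_file_path))


if __name__ == "__main__":
    main()
```

This flag-per-input shape is why the tests below can exercise each sample both with "(no arguments)" and with explicit flags like `--local_file_path`.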
- Next, to pull in the required sample resources and tests, and to configure the tests to be runnable, download this script

```shell
curl -LO https://gist.github.com/beccasaurus/8ac942988a8f6021a6bf938eb0b6858b/raw/include_samples.sh
chmod +x include_samples.sh
```
- Edit it. There are a few variables to change at the top of the file.

```shell
##
# Configure these variables:
##
API_NAME=speech
VERSIONS="v1 v1p1beta1"
LANGUAGE=python

# Or run with arguments
./include_samples.sh [api name] "[versions]" [language]
```
- Run it.

```shell
./include_samples.sh
```
- If googleapis has *.test.yaml test files and/or a sample_resources.yaml file for this API, you should now have this in samples/

```
tree samples/
samples/
├── resources
│   ├── brooklyn_bridge.flac
│   ├── brooklyn_bridge.raw
│   ├── brooklyn_bridge.wav
│   ├── commercial_mono.wav
│   ├── hello.raw
│   ├── hello.wav
│   ├── multi.flac
│   └── multi.wav
├── v1
│   ├── speech_transcribe_async.py
│   ├── speech_transcribe_async_gcs.py
│   ├── speech_transcribe_async_word_time_offsets_gcs.py
│   ├── speech_transcribe_enhanced_model.py
│   ├── speech_transcribe_model_selection.py
│   ├── speech_transcribe_model_selection_gcs.py
│   ├── speech_transcribe_multichannel.py
│   ├── speech_transcribe_multichannel_gcs.py
│   ├── speech_transcribe_sync.py
│   ├── speech_transcribe_sync_gcs.py
│   └── test
│       ├── samples.manifest.yaml
│       ├── speech_transcribe_async.test.yaml
│       ├── speech_transcribe_async_gcs.test.yaml
│       ├── speech_transcribe_async_word_time_offsets_gcs.test.yaml
│       ├── speech_transcribe_enhanced_model.test.yaml
│       ├── speech_transcribe_model_selection.test.yaml
│       ├── speech_transcribe_model_selection_gcs.test.yaml
│       ├── speech_transcribe_multichannel.test.yaml
│       ├── speech_transcribe_multichannel_gcs.test.yaml
│       ├── speech_transcribe_sync.test.yaml
│       └── speech_transcribe_sync_gcs.test.yaml
└── v1p1beta1
    ├── speech_transcribe_auto_punctuation_beta.py
    ├── speech_transcribe_diarization_beta.py
    ├── speech_transcribe_multilanguage_beta.py
    ├── speech_transcribe_recognition_metadata_beta.py
    ├── speech_transcribe_word_level_confidence_beta.py
    └── test
        ├── samples.manifest.yaml
        ├── speech_transcribe_auto_punctuation_beta.test.yaml
        ├── speech_transcribe_diarization_beta.test.yaml
        ├── speech_transcribe_multilanguage_beta.test.yaml
        ├── speech_transcribe_recognition_metadata_beta.test.yaml
        └── speech_transcribe_word_level_confidence_beta.test.yaml
```
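The samples.manifest.yaml in each test/ directory is what tells sample-tester how to invoke each sample. As a rough illustration only – the field names below are assumptions, not the exact sample-tester schema – it looks something like:

```yaml
# Hypothetical shape of samples.manifest.yaml (field names are assumptions;
# consult the sample-tester documentation for the real schema).
type: manifest/samples
schema_version: 3
base: &common
  environment: python
  bin: python3
samples:
- <<: *common
  sample: speech_transcribe_sync
  path: samples/v1/speech_transcribe_sync.py
- <<: *common
  sample: speech_transcribe_sync_gcs
  path: samples/v1/speech_transcribe_sync_gcs.py
```

The key idea is the mapping from each sample's region tag to the command used to execute it, which is what lets the test executor run samples the way a user would.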
- Install sample-tester for running the generated code sample tests

  Install from this fork/branch

```shell
python3 -m pip install sample-tester
```
- Run the tests for v1 and v1p1beta1

```shell
sample-tester samples/*/test/*

# or a specific version
sample-tester samples/v1/test/*

# or a specific test
sample-tester --cases "[test name]" samples/v1/test/*

# print out how each sample is run + its output
sample-tester -v detailed samples/v1/test/*
```
If all went well, you should have a bunch of passing tests :)
```
RUNNING: Test environment: "python"
RUNNING: Test suite: "Transcribe Audio File using Long Running Operation (Local File) (LRO)"
PASSED: Test case: "speech_transcribe_async (no arguments)"
PASSED: Test case: "speech_transcribe_async (--local_file_path)"
RUNNING: Test suite: "Transcript Audio File using Long Running Operation (Cloud Storage) (LRO)"
PASSED: Test case: "speech_transcribe_async_gcs (no arguments)"
PASSED: Test case: "speech_transcribe_async_gcs (--storage_uri)"
RUNNING: Test suite: "Getting word timestamps (Cloud Storage) (LRO)"
PASSED: Test case: "speech_transcribe_async_word_time_offsets_gcs (no arguments)"
PASSED: Test case: "speech_transcribe_async_word_time_offsets_gcs (--storage_uri)"
RUNNING: Test suite: "Using Enhanced Models (Local File)"
PASSED: Test case: "speech_transcribe_enhanced_model (no arguments)"
PASSED: Test case: "speech_transcribe_enhanced_model (--local_file_path)"
RUNNING: Test suite: "Selecting a Transcription Model (Local File)"
PASSED: Test case: "speech_transcribe_model_selection (no arguments)"
PASSED: Test case: "speech_transcribe_model_selection (--local_file_path)"
PASSED: Test case: "speech_transcribe_model_selection (--model)"
PASSED: Test case: "speech_transcribe_model_selection (invalid --model)"
RUNNING: Test suite: "Selecting a Transcription Model (Cloud Storage)"
PASSED: Test case: "speech_transcribe_model_selection_gcs (no arguments)"
PASSED: Test case: "speech_transcribe_model_selection_gcs (--local_file_path)"
PASSED: Test case: "speech_transcribe_model_selection_gcs (--model)"
PASSED: Test case: "speech_transcribe_model_selection_gcs (invalid --model)"
RUNNING: Test suite: "Multi-Channel Audio Transcription (Local File)"
PASSED: Test case: "speech_transcribe_multichannel (no arguments)"
PASSED: Test case: "speech_transcribe_multichannel (--local_file_path)"
RUNNING: Test suite: "Multi-Channel Audio Transcription (Cloud Storage)"
PASSED: Test case: "speech_transcribe_multichannel_gcs (no arguments)"
PASSED: Test case: "speech_transcribe_multichannel_gcs (--storage_uri)"
RUNNING: Test suite: "Transcribe Audio File (Local File)"
PASSED: Test case: "speech_transcribe_sync (no arguments)"
PASSED: Test case: "speech_transcribe_sync (--local_file_path)"
RUNNING: Test suite: "Transcript Audio File (Cloud Storage)"
PASSED: Test case: "speech_transcribe_sync_gcs (no arguments)"
PASSED: Test case: "speech_transcribe_sync_gcs (--storage_uri)"
RUNNING: Test suite: "Getting punctuation in results (Local File) (Beta)"
PASSED: Test case: "speech_transcribe_auto_punctuation_beta (no arguments)"
PASSED: Test case: "speech_transcribe_auto_punctuation_beta (--local_file_path)"
RUNNING: Test suite: "Separating different speakers (Local File) (LRO) (Beta)"
PASSED: Test case: "speech_transcribe_diarization_beta (no arguments)"
PASSED: Test case: "speech_transcribe_diarization_beta (--local_file_path)"
RUNNING: Test suite: "Detecting language spoken automatically (Local File) (Beta)"
PASSED: Test case: "speech_transcribe_multilanguage_beta (no arguments)"
PASSED: Test case: "speech_transcribe_multilanguage_beta (--local_file_path)"
RUNNING: Test suite: "Adding recognition metadata (Local File) (Beta)"
PASSED: Test case: "speech_transcribe_recognition_metadata_beta (no arguments)"
PASSED: Test case: "speech_transcribe_recognition_metadata_beta (--local_file_path)"
RUNNING: Test suite: "Enabling word-level confidence (Local File) (Beta)"
PASSED: Test case: "speech_transcribe_word_level_confidence_beta (no arguments)"
PASSED: Test case: "speech_transcribe_word_level_confidence_beta (--local_file_path)"
Tests passed
```
- Made updates? Changes to the samples or tests in googleapis? Run again. You can even delete samples/ if you want.

```shell
rm -r samples/ && python3 -m synthtool && ./include_samples.sh && sample-tester samples/*/test/*
```
When you add `include_samples=True` to your synth.py, running synthtool should perform the following:

- Run artman with the `--dev_samples` flag
  -> You can do this today via `py_library(..., generator_args=['--dev_samples'])`
- Copy sample tests from googleapis
  -> Based on the current implementation of `include_protos=True`, the `*.test.yaml` files from googleapis/googleapis[-private] should be copied over, probably into the `samples/` directory which artman outputs for each library
- Generate a per-language code sample manifest file
  -> This relies on the `sample-tester` PyPI package being available
  -> This generated file has an index of every code sample which was generated, indexed by region tag. It defines how to execute each code sample, e.g. via `python3 samples/v1/sample_name.py` or `mvn:exec -Dsample_name`
  -> This is required by the test executor, so that it can execute each sample by invoking it the way a user would
- Copy sample resources from Cloud Storage
  -> Read the `sample_resources.yaml` file in the root API directory in googleapis. It contains a list of all required resource files, including a public GCS URI for download and a file description.
  -> Download these into `samples/resources/` for use by samples and tests.
  -> May also support copying resources/ files directly from googleapis
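The resource file described in the last step might look roughly like this. This is a hypothetical sketch – the exact field names used by sample_resources.yaml are assumptions, and only the general idea (one entry per required file, each with a public GCS URI and a description) comes from the text above:

```yaml
# Hypothetical shape of sample_resources.yaml (field names are assumptions).
resources:
- uri: gs://cloud-samples-data/speech/brooklyn_bridge.flac
  description: FLAC recording used by the transcription samples
- uri: gs://cloud-samples-data/speech/commercial_mono.wav
  description: Mono WAV recording used by the enhanced-model sample
```

Whatever the exact schema, the copy step only needs two things per entry: where to download the file from, and what it is for.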