Cloud OnBoard is a free online instructor-led training program that enables developers and IT professionals to expand their skill set into the cloud. Google Cloud Platform (GCP) Fundamentals Series brings the Google Cloud Community together for three consecutive days of interactive learning and hands-on labs. Choose one, two, or all three online half-day programs and take your skills to new heights:
Google Developers Codelabs provide a guided, tutorial, hands-on coding experience. Most codelabs will step you through the process of building a small application, or adding a new feature to an existing application. They cover a wide range of topics such as Android Wear, Google Compute Engine, Project Tango, and Google APIs on iOS.
The framework was created by seasoned experts at Google Cloud, including customer engineers, solution architects, cloud reliability engineers, and members of the professional service organization. It consists of the following series of articles:
Overview (this article)
Google Cloud system design considerations
Operational excellence
Security, privacy, and compliance
Reliability
Performance and cost optimization
Each principle section provides details on strategies, best practices, design questions, recommendations, key Google Cloud services, and links to resources.
CI/CD pipeline compared to CT pipeline
The availability of new data is one trigger to retrain the ML model. The availability of a new implementation of the ML pipeline (including new model architecture, feature engineering, and hyperparameters) is another important trigger to re-execute the ML pipeline. This new implementation of the ML pipeline serves as a new version of the model prediction service, for example, a microservice with a REST API for online serving. The difference between the two cases is as follows:
To train a new ML model with new data, the previously deployed CT pipeline is executed. No new pipelines or components are deployed; only a new prediction service or newly trained model is served at the end of the pipeline.
To train a new ML model with a new implementation, a new pipeline is deployed through a CI/CD pipeline.
The following diagram shows the relationship between the CI/CD pipeline and the ML CT pipeline.
Figure 1. CI/CD and ML CT pipelines.
Designing a TFX-based ML system
Orchestrating the ML system using Kubeflow Pipelines
Setting up CI/CD for ML on Google Cloud
Figure 6: High-level overview of CI/CD for Kubeflow pipelines.
This section describes a conceptual architecture for building a passive data lineage system for a SQL-like data warehouse. You can implement the architecture in different ways.
The following diagram shows a high-level legacy architecture before the migration. It illustrates the catalog of available data sources, legacy data pipelines, legacy operational pipelines and feedback loops, and legacy BI reports and dashboards that are accessed by your end users.
<img src="https://cloud.google.com/architecture/dw2bq/images/dw-bq-migration-overview-architecture-before-migration.svg" style="display:block;margin-left:0;width:800px;"/>
During migration, you run both your legacy data warehouse and BigQuery, as detailed in this document. The reference architecture in the following diagram highlights that both data warehouses offer similar functionality and paths—both can ingest from the source systems, integrate with the business applications, and provide the required user access. Importantly, the diagram also highlights that data is synchronized from your data warehouse to BigQuery. This allows use cases to be offloaded during the entire duration of the migration effort.
<img src="https://cloud.google.com/architecture/dw2bq/images/dw-bq-migration-overview-architecture-during-migration.svg" style="display:block;margin-left:0;width:800px;"/>
6.1.Monitoring/logging/profiling/alerting solution
6.2.Deployment and release management
6.3.Assisting with the support of deployed solutions
6.4.Evaluating quality control measures
1. Google Cloud Platform Fundamentals: Core Infrastructure
01.Introducing Google Cloud Platform
02.Getting Started with Google Cloud Platform
03.Virtual Machines in the Cloud
04.Storage in the Cloud
05.Containers in the Cloud
06.Applications in the Cloud
07.Developing, Deploying and Monitoring in the Cloud
08.Big Data and Machine Learning in the Cloud
2. Essential Google Cloud Infrastructure: Foundation
01.Introduction to GCP
02.Virtual Networks
03.Virtual Machines
3. Essential Google Cloud Infrastructure: Core Services
01.Cloud IAM
02.Storage and Database Services
03.Resource Management
04.Resource Monitoring
4. Elastic Google Cloud Infrastructure: Scaling and Automation
01.Interconnecting Networks
02.Load Balancing and Autoscaling
03.Infrastructure Automation
04.Managed Services
5. Reliable Google Cloud Infrastructure: Design and Process
00.Introduction: Architecting systems is a matter of weighing the pros and cons of various solutions and trying to find the best solution given your requirements and constraints.
01.Defining Services
02.Microservice Design and Architecture
03.DevOps Automation
04.Choosing Storage Solutions
05.Google Cloud and Hybrid Network Architecture
06.Deploying Applications to Google Cloud
07.Designing Reliable Systems
08.Security
09.Maintenance and Monitoring
6. Architecting with Google Kubernetes Engine: Foundations
01.Introduction to Google Cloud
02.Introduction to Containers and Kubernetes
03.Kubernetes Architecture
7. Preparing for the Google Cloud Professional Cloud Architect Exam
01.Welcome to Preparing for the Professional Cloud Architect Exam
Choose your path, build your skills, and validate your knowledge. All in one place. Register here before November 6th to claim your one month free training offer.
Choose from end-to-end training created by the Google Developers Training team, materials and tutorials for self-study, online courses and Nanodegrees through Udacity, and more. And when you're ready, you can take a Google Developers Certification exam to gain recognition for your development skills.
Data governance is everything you do to ensure data is secure, private, accurate, available, and usable. It includes the actions people must take, the processes they must follow, and the technology that supports them throughout the data life cycle.
What is data integration?
Learn about data integration—the process of unifying data from different sources into a more useful view.
What is a data lake?
Learn how data lakes store, process, and secure large amounts of data.
What is a data warehouse?
Learn about data warehouses (DW), which are systems for data analysis and reporting.
What is ETL?
Learn how ETL lets companies convert structured and unstructured data to drive business decisions.
What is predictive analytics?
Learn how predictive analytics uses data, statistics, modeling, and machine learning to help predict and plan for future events, or find opportunities.
What is Presto?
Learn how Presto, an open source distributed SQL query engine created by Facebook developers, runs interactive analytics against large volumes of data.
What is streaming analytics?
Learn about streaming analytics, which processes and analyzes data from sources that continuously send data.
What is time series?
Learn how to model historical time-series data in order to make predictions about future time points and common use cases.
Vertex AI provides Docker container images that you run as pre-built containers for serving predictions and explanations from trained model artifacts. These containers, which are organized by machine learning (ML) framework and framework version, provide HTTP prediction servers that you can use to serve predictions with minimal configuration. In many cases, using a pre-built container is simpler than creating your own custom container for prediction.
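A minimal sketch of serving a model artifact with a pre-built prediction container via the Vertex AI SDK; the project, bucket path, and container image URI below are illustrative assumptions, not values from this document:

```python
# Hypothetical sketch: upload a scikit-learn artifact and serve it with a
# pre-built prediction container (names, paths, and image tag are examples).
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # assumed project/region

model = aiplatform.Model.upload(
    display_name="my-sklearn-model",
    artifact_uri="gs://my-bucket/model/",  # folder containing the saved model file
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"  # example image
    ),
)

endpoint = model.deploy(machine_type="n1-standard-2")
print(endpoint.predict(instances=[[1.0, 2.0, 3.0, 4.0]]))
```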
To get batch predictions from a custom-trained model, prepare your input data in one of the following ways:
JSON Lines
TFRecord
CSV
File list:
Create a text file where each row is the Cloud Storage URI to a file. Vertex AI reads each URI as binary, then base64-encodes it and sends it in a JSON instance to the container that serves your model's predictions.
If you plan to use the Google Cloud Console to get batch predictions, paste your file list directly into the Cloud Console. Otherwise save your file list in a Cloud Storage bucket.
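A small sketch of preparing both input styles (a JSON Lines instance file and a file list) and uploading them to Cloud Storage; the bucket and object names are examples:

```python
# Write a JSON Lines input file and a file list for batch prediction,
# then upload both to Cloud Storage (bucket/paths are examples).
import json
from google.cloud import storage

instances = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
with open("instances.jsonl", "w") as f:
    for instance in instances:
        f.write(json.dumps(instance) + "\n")    # one JSON instance per line

with open("file_list.txt", "w") as f:
    f.write("gs://my-bucket/images/cat.jpg\n")  # each row is a Cloud Storage URI
    f.write("gs://my-bucket/images/dog.jpg\n")

bucket = storage.Client().bucket("my-bucket")
bucket.blob("batch/instances.jsonl").upload_from_filename("instances.jsonl")
bucket.blob("batch/file_list.txt").upload_from_filename("file_list.txt")
```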
Track model quality using Vertex AI Model Monitoring
Orchestrate your ML workflow using Vertex AI pipelines
Pipelines allow you to automate, monitor, and experiment with interdependent parts of an ML workflow.
ML pipelines are portable, scalable, and based on containers.
Each individual part of your pipeline workflow (for example, creating a dataset or training a model) is defined by code. This code is referred to as a component. Each instance of a component is called a step.
Vertex Pipelines can run pipelines built using the Kubeflow Pipelines SDK v1.6 or higher, or TensorFlow Extended v0.30.0 or higher.
If you use TensorFlow in an ML workflow that processes terabytes of structured data or text data, we recommend that you build your pipeline using TFX.
For other use cases, we recommend that you build your pipeline using the Kubeflow Pipelines SDK. By building a pipeline with the Kubeflow Pipelines SDK, you can implement your workflow by building custom components or reusing prebuilt components, such as the Google Cloud pipeline components. Google Cloud pipeline components make it easier to use Vertex AI services like AutoML in your pipeline.
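A minimal pipeline sketch using the Kubeflow Pipelines v2 DSL (assuming kfp 1.8+); component names and the output path are illustrative:

```python
# Minimal sketch of a Kubeflow Pipelines definition compiled for Vertex AI Pipelines.
# Assumes the kfp SDK with the v2 DSL (kfp>=1.8); names are illustrative.
from kfp.v2 import dsl, compiler

@dsl.component
def preprocess(message: str) -> str:
    # Each component instance runs as its own containerized step.
    return message.upper()

@dsl.component
def train(data: str) -> str:
    return f"model trained on: {data}"

@dsl.pipeline(name="hello-vertex-pipeline")
def pipeline(message: str = "hello"):
    prep_task = preprocess(message=message)
    train(data=prep_task.output)   # step-to-step dependency via outputs

compiler.Compiler().compile(pipeline_func=pipeline, package_path="pipeline.json")
```

The compiled `pipeline.json` can then be submitted as a pipeline run (for example with the Vertex AI SDK's `aiplatform.PipelineJob`).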
Google Cloud Pipeline Components (GCPC) are available through the Google Cloud Pipeline Components SDK, which provides a set of prebuilt components that are production quality, consistent, performant, and easy to use in Vertex AI Pipelines. You can use these components to perform ML tasks. For example, you can use components to complete the following (see the sketch after this list):
Create a new dataset and load different data types into the dataset (image, tabular, text, or video).
Export data from a dataset to Cloud Storage.
Use AutoML to train a model using image, tabular, text, or video data.
Run a custom training job using a custom container or a Python package.
Upload an existing model to Vertex AI for batch prediction.
Create a new endpoint and deploy a model to it for online predictions.
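A hedged sketch of wiring two of these prebuilt components into a pipeline. Module paths differ between GCPC releases; this assumes the legacy `google_cloud_pipeline_components.aiplatform` namespace, and all project, dataset, and column names are examples:

```python
# Hedged sketch: prebuilt Google Cloud pipeline components inside a pipeline.
from kfp.v2 import dsl
from google_cloud_pipeline_components import aiplatform as gcc_aip  # legacy namespace

@dsl.pipeline(name="automl-tabular-sketch")
def pipeline(project: str = "my-project", location: str = "us-central1"):
    dataset_op = gcc_aip.TabularDatasetCreateOp(
        project=project,
        display_name="my-dataset",
        bq_source="bq://my-project.my_dataset.my_table",  # example source
    )
    gcc_aip.AutoMLTabularTrainingJobRunOp(
        project=project,
        display_name="my-automl-training",
        optimization_prediction_type="regression",
        dataset=dataset_op.outputs["dataset"],
        target_column="label",
    )
```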
This demo focuses on how you can use Vertex AI to train and deploy an ML model. It assumes that you are familiar with machine learning, even though the training code is provided to you. You will use Vertex AI Datasets for dataset creation and management, and custom training to train a scikit-learn model. Finally, you will deploy the trained model and get online predictions. The dataset used in this demo is the Titanic dataset.
The official notebooks are a collection of curated and non-curated notebooks authored by Google Cloud staff members. The curated notebooks are linked to in the Vertex AI online web documentation.
Before you begin: Set up your Google Cloud Platform project, authentication, and enable AutoML Vision.
Preparing your training data: Learn best practices in organizing and annotating the images you will use to train your model, as well as format a training CSV file.
Creating datasets and importing images: Create the dataset and import the training data used to train your model.
Training Cloud-hosted models: Train your custom model hosted on the Cloud and get the status of the training operation.
Training Edge (exportable) models: Train your custom exportable Edge model and get the status of the training operation.
Evaluating models: Review the performance of your model.
Deploying models: Deploy your model for use after training completes.
Making individual predictions: Use your custom model to annotate an individual prediction image with labels and bounding boxes online.
Making batch predictions: Use your custom model to annotate a batch of prediction images with labels and bounding boxes.
Exporting Edge models: Export your trained Edge model in different formats to Cloud Storage for use on edge devices.
Undeploying models: Undeploy your model after you are done using it to avoid further hosting charges.
Managing datasets: Manage datasets associated with your project.
Each DML statement initiates an implicit transaction, which means that changes made by the statement are automatically committed at the end of each successful DML statement.
Rows that were written to a table recently by using streaming (the tabledata.insertall method or the Storage Write API) cannot be modified with UPDATE, DELETE, or MERGE statements. The recent writes are those that occur within the last 30 minutes. All other rows in the table remain modifiable by using UPDATE, DELETE, or MERGE statements. The streamed data can take up to 90 minutes to become available for copy operations.
Correlated subqueries within a when_clause, search_condition, merge_update_clause or merge_insert_clause are not supported for MERGE statements.
Queries that contain DML statements cannot use a wildcard table as the target of the query. For example, a wildcard table can be used in the FROM clause of an UPDATE query, but a wildcard table cannot be used as the target of the UPDATE operation.
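A small sketch of running a DML statement with the BigQuery Python client and checking how many rows it changed; the project, dataset, and table names are examples:

```python
# Run a DML UPDATE and inspect the affected row count.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

dml = """
UPDATE `my-project.my_dataset.orders`
SET status = 'shipped'
WHERE status = 'packed'
"""
job = client.query(dml)   # each DML statement runs in its own implicit transaction
job.result()              # wait for completion
print(f"Modified {job.num_dml_affected_rows} rows")
```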
Both partitioning and clustering can improve performance and reduce query cost.
Use clustering under the following circumstances:
You don't need strict cost guarantees before running the query.
You need more granularity than partitioning alone allows. To get clustering benefits in addition to partitioning benefits, you can use the same column for both partitioning and clustering.
Your queries commonly use filters or aggregation against multiple particular columns.
The cardinality of values in a column or group of columns is large.
Use partitioning under the following circumstances:
You want to know query costs before a query runs. Partition pruning is done before the query runs, so you can get the query cost after partition pruning through a dry run. Cluster pruning is done when the query runs, so the cost is known only after the query finishes.
You need partition-level management. For example, you want to set a partition expiration time, load data to a specific partition, or delete partitions.
You want to specify how the data is partitioned and what data is in each partition. For example, you want to define time granularity or define the ranges used to partition the table for integer range partitioning.
Prefer clustering over partitioning under the following circumstances:
Partitioning results in a small amount of data per partition (approximately less than 1 GB).
Partitioning results in a large number of partitions beyond the limits on partitioned tables.
Partitioning results in your mutation operations modifying most partitions in the table frequently (for example, every few minutes).
You can also combine partitioning with clustering. Data is first partitioned and then data in each partition is clustered by the clustering columns.
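A sketch of creating a table that combines both techniques, so partition pruning and cluster pruning can both apply; the table and column names are examples:

```python
# Create a day-partitioned, clustered table via a DDL statement.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
ddl = """
CREATE TABLE `my-project.my_dataset.events`
(
  event_ts TIMESTAMP,
  customer_id STRING,
  event_type STRING,
  payload STRING
)
PARTITION BY DATE(event_ts)
CLUSTER BY customer_id, event_type
"""
client.query(ddl).result()
```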
You can change or remove a table's clustering specifications, or change the set of clustered columns in a clustered table. This method of updating the clustering column set is useful for tables that use continuous streaming inserts because those tables cannot be easily swapped by other methods.
You can change the clustering specification in the following ways:
Call the tables.update or tables.patch API method.
Call the bq command-line tool's bq update command with the --clustering_fields flag.
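A sketch of the API route using the Python client (equivalent to tables.update/patch); the table name and new column set are examples:

```python
# Change the clustered columns on an existing table.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
table = client.get_table("my-project.my_dataset.events")
table.clustering_fields = ["customer_id", "event_type"]   # new clustering column set
client.update_table(table, ["clustering_fields"])
```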
For information on adding the Require partition filter option when you create a partitioned table, see Creating partitioned tables.
When you create a partitioned table, you can require the use of predicate filters by enabling the Require partition filter option. If a partitioned table has this setting, then every query on that table must include at least one predicate that only references the partitioning column. Queries without such a predicate return the following error:
Cannot query over table 'project_id.dataset.table' without a filter that can be used for partition elimination.
There must be at least one predicate that only references a partition column for the filter to be considered eligible for partition elimination (the predicate must compare against a literal value; it cannot be the result of a subquery). For example, for a table partitioned on column partition_id with an additional column f in its schema, both of the following WHERE clauses satisfy the requirement:
WHERE partition_id = "foo"
WHERE partition_id = "foo" AND f = "bar"
BigQuery writes all query results to a table. The table is either explicitly identified by the user (a destination table), or it is a temporary, cached results table. Temporary, cached results tables are maintained per-user, per-project. There are no storage costs for temporary tables, but if you write query results to a permanent table, you are charged for storing the data.
All query results, including both interactive and batch queries, are cached in temporary tables for approximately 24 hours with some exceptions.
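A quick way to see this behavior from the Python client: `cache_hit` tells you whether the cached results table was reused, and `destination` points at the (temporary or permanent) results table.

```python
# Inspect cache usage and the results table of a query.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
job = client.query("SELECT 1 AS x")
job.result()
print(job.cache_hit)      # True if served from the 24-hour results cache
print(job.destination)    # temporary table unless a destination table was set
```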
WITH Races AS (
SELECT "800M" AS race,
[STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)]
AS participants)
SELECT
race,
participant
FROM Races AS r, UNNEST(r.participants) AS participant;
The BigQuery Storage Read API provides fast access to BigQuery-managed storage by using an rpc-based protocol.
Background
Historically, users of BigQuery have had two mechanisms for accessing BigQuery-managed table data:
Record-based paginated access by using the tabledata.list or jobs.getQueryResults REST API methods. The BigQuery API provides structured row responses in a paginated fashion appropriate for small result sets.
Bulk data export using BigQuery extract jobs that export table data to Cloud Storage in a variety of file formats such as CSV, JSON, and Avro. Table exports are limited by daily quotas and by the batch nature of the export process.
The BigQuery Storage Read API provides a third option that represents an improvement over prior options. When you use the Storage Read API, structured data is sent over the wire in a binary serialization format. This allows for additional parallelism among multiple consumers for a set of results.
The Storage Read API does not provide functionality related to managing BigQuery resources such as datasets, jobs, or tables.
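A hedged sketch of the read-session flow against a public table; the billing project is an example, and decoding Avro rows assumes the fastavro extra is installed:

```python
# Open a read session and stream rows with the Storage Read API.
from google.cloud import bigquery_storage_v1

client = bigquery_storage_v1.BigQueryReadClient()

table = "projects/bigquery-public-data/datasets/usa_names/tables/usa_1910_current"
requested_session = bigquery_storage_v1.types.ReadSession(
    table=table,
    data_format=bigquery_storage_v1.types.DataFormat.AVRO,
)
session = client.create_read_session(
    parent="projects/my-project",       # project billed for the read
    read_session=requested_session,
    max_stream_count=1,
)

reader = client.read_rows(session.streams[0].name)
for row in reader.rows(session):        # requires fastavro for AVRO decoding
    print(dict(row))
    break
```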
This tutorial creates a set of time-series models to perform multiple time-series forecasts with a single query. You will use the new_york.citibike_trips data. This data contains information about Citi Bike trips in New York City.
This tutorial uses a set of techniques to enable 100x faster forecasting without sacrificing much forecasting accuracy. It enables forecasting millions of time series within hours using a single query.
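A hedged sketch of the core idea: one CREATE MODEL statement with a time_series_id_col trains a separate model per series, and one ML.FORECAST call forecasts all of them. Model, dataset, and column names are illustrative assumptions:

```python
# Train per-series ARIMA_PLUS models and forecast them with a single query each.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
client.query("""
CREATE OR REPLACE MODEL `my_dataset.citibike_arima`
OPTIONS (
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'trip_date',
  time_series_data_col = 'num_trips',
  time_series_id_col = 'start_station_name'   -- one model per station
) AS
SELECT
  DATE(starttime) AS trip_date,
  COUNT(*) AS num_trips,
  start_station_name
FROM `bigquery-public-data.new_york.citibike_trips`
GROUP BY trip_date, start_station_name
""").result()

for row in client.query(
    "SELECT * FROM ML.FORECAST(MODEL `my_dataset.citibike_arima`, STRUCT(30 AS horizon))"
).result():
    print(row)
```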
This tutorial exports a BigQuery ML model and then deploys the model either on AI Platform or on a local machine. You will use the iris table from the BigQuery public datasets.
In its simplest form, a Data Catalog search query comprises a single predicate. Such a predicate can match several pieces of metadata:
A substring of a name, display name, or description of a data asset
Exact type of a data asset
A substring of a column name (or nested column name) in the schema of a data asset
A substring of a project ID
The value of a public tag, the name of a public tag template, or a field name in a public tag template attached to a data entry.
(Preview) A string for an email address or name for a data steward
(Preview) A string from an overview description
Qualified predicates
An equal sign (=) restricts the search to an exact match.
A colon (:) after the key matches the predicate to either a substring or token within the value in search results.
Data Catalog supports the following qualifiers:
| Qualifier | Description |
| --- | --- |
| name:x | Matches x as a substring of the data asset ID. |
| displayname:x | Matches x as a substring of the data asset display name. |
| column:x | Matches x as a substring of the column name (or nested column name) in the schema of the data asset. Currently, you can search for a nested column by its path using the AND logical operator. For example, column:(foo bar) matches a nested column with the foo.bar path. |
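A small sketch of issuing such a qualified search with the Data Catalog Python client; the project ID is an example:

```python
# Search the catalog for entries whose schema contains a nested column foo.bar.
from google.cloud import datacatalog_v1

client = datacatalog_v1.DataCatalogClient()
scope = datacatalog_v1.SearchCatalogRequest.Scope(include_project_ids=["my-project"])

results = client.search_catalog(request={"scope": scope, "query": "column:(foo bar)"})
for result in results:
    print(result.relative_resource_name)
```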
This tutorial suggests a solution for keeping a historical record of Data Catalog tag metadata. It captures and parses the Data Catalog audit logs from Cloud Logging, processes them in real time with Pub/Sub and Dataflow, and appends the resulting change records to a BigQuery table for historical analysis.
Another common question we hear from potential clients is: Do you have prebuilt templates to help us get started with creating our own? Due to the popularity of this request, we created a few examples to illustrate the types of templates being deployed by our users. You can find them in YAML format below and through a GitHub repo. There is also a script in the same repo that reads the YAML-based templates and creates the actual templates in Data Catalog.
BigQuery stores metadata about each object stored in it. You can query these metadata tables to get a better understanding of a dataset and its contents. See documentation.
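A small sketch of querying the INFORMATION_SCHEMA metadata views; the dataset name is an example:

```python
# List the tables and columns of a dataset from its metadata views.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
rows = client.query("""
SELECT table_name, column_name, data_type
FROM `my-project.my_dataset.INFORMATION_SCHEMA.COLUMNS`
ORDER BY table_name, ordinal_position
""").result()
for row in rows:
    print(row.table_name, row.column_name, row.data_type)
```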
BQ nested and repeated columns allow you to achieve the performance benefits of denormalization while retaining the structure of the data.
To illustrate, consider this query against a Bitcoin dataset. The query joins the blocks and transactions tables to find the max transaction ID for each block.
Dataplex data quality tasks enable you to define and execute data quality checks across tables in BigQuery and Cloud Storage. Dataplex data quality tasks allow you to apply regular data controls in BigQuery environments.
When to create Dataplex data quality tasks
You want to validate data as part of the data production pipeline.
You want to routinely monitor quality of datasets against your expectations.
You want to build data quality reports for regulatory requirements.
On Cloud Run, your code can either run continuously as a service or as a job. Both services and jobs run in the same environment and can use the same integrations with other services on Google Cloud.
Cloud Run services. Used to run code that responds to web requests or events.
Cloud Run jobs. Used to run code that performs work (a job) and quits when the work is done.
When to use Cloud Run services
Cloud Run services are great for code that handles requests or events. Example use cases include:
Websites and web applications
Build your web app using your favorite stack, access your SQL database, and render dynamic HTML pages.
APIs and microservices
You can build a REST API, a GraphQL API, or private microservices that communicate over HTTP or gRPC.
Streaming data processing
Cloud Run services can receive messages from Pub/Sub push subscriptions and events from Eventarc.
When to use Cloud Run jobs
Cloud Run jobs are well-suited to run code that performs work (a job) and quits when the work is done. Here are a few examples:
Script or tool
Run a script to perform database migrations or other operational tasks.
Array job
Perform highly parallelized processing of all files in a Cloud Storage bucket.
Scheduled job
Create and send invoices at regular intervals, or save the results of a database query as XML and upload the file every few hours.
You can structure a job as a single task or as multiple, independent tasks (up to 10,000 tasks) that can be executed in parallel. Each task runs one container instance and can be configured to retry in case of failure. Each task is aware of its index, which is stored in the CLOUD_RUN_TASK_INDEX environment variable. The overall count of tasks is stored in the CLOUD_RUN_TASK_COUNT environment variable. If you are processing data in parallel, your code is responsible for determining which task handles which subset of the data.
gcloud run jobs create JOB_NAME --image IMAGE_URL OPTIONS
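A minimal sketch of the sharding pattern described above, using the CLOUD_RUN_TASK_INDEX and CLOUD_RUN_TASK_COUNT environment variables; the input list is an example:

```python
# Shard work across Cloud Run job tasks using the task index/count env vars.
import os

task_index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", 0))
task_count = int(os.environ.get("CLOUD_RUN_TASK_COUNT", 1))

files = [f"gs://my-bucket/input/part-{i:05d}" for i in range(1000)]  # example inputs

# Each task processes the slice of inputs whose position matches its index.
for i, path in enumerate(files):
    if i % task_count == task_index:
        print(f"task {task_index} processing {path}")
```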
In order to be a good fit for Cloud Run, your app needs to meet all of the following criteria. See the Cloud Run container contract for more information.
Serves requests, streams, or events delivered via HTTP, HTTP/2, WebSockets, or gRPC, or executes to completion.
Does not require a persistent local file system; a local ephemeral file system or a network file system is sufficient.
Is built to handle multiple instances of the app running simultaneously.
Does not require more than 8 CPU and 32 GiB of memory per instance.
The Cloud SQL SLA excludes outages "caused by factors outside of Google’s reasonable control". This page describes some user-controlled configurations that can cause an outage of a Cloud SQL instance to be excluded from the SLA.
20210618 Orchestrating your data workloads in Google Cloud, EN
Services like Data Fusion, Dataflow and Dataproc are great for ingesting, processing and transforming your data. These services are designed to operate directly on big data and can build both batch and real time pipelines that support the performant aggregation (shuffling, grouping) and scaling of data. This is where you should build your data pipelines and you can use Composer to manage the execution of these services as part of a wider workflow.
Google Cloud’s first general purpose workflow orchestration tool was Cloud Composer.
However, if you want to process events or chain APIs in a serverless way—or have workloads that are bursty or latency-sensitive—we recommend Workflows.
Dataproc provides the ability for graphics processing units (GPUs) to be attached to the master and worker Compute Engine nodes in a Dataproc cluster. You can use these GPUs to accelerate specific workloads on your instances, such as machine learning and data processing.
Use the spark-bigquery-connector with Apache Spark to read and write data from and to BigQuery. This tutorial demonstrates a PySpark application that uses the spark-bigquery-connector.
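A hedged PySpark sketch of the connector in both directions, reading a public table and writing aggregated results back to BigQuery; the output dataset and temporary bucket are examples:

```python
# Read from and write to BigQuery through the spark-bigquery-connector.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-bigquery-demo").getOrCreate()

words = (
    spark.read.format("bigquery")
    .option("table", "bigquery-public-data.samples.shakespeare")
    .load()
)
word_count = words.groupBy("word").sum("word_count")

# Writing requires a temporary GCS bucket for the connector (bucket is an example).
(
    word_count.write.format("bigquery")
    .option("table", "my_dataset.wordcount_output")
    .option("temporaryGcsBucket", "my-temp-bucket")
    .mode("overwrite")
    .save()
)
```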
Dataproc Serverless for Spark runs workloads within Docker containers. The container provides the runtime environment for the workload's driver and executor processes. By default, Dataproc Serverless for Spark uses a container image that includes the default Spark, Java, Python and R packages associated with a runtime release version. The Dataproc Serverless for Spark batches API allows you to use a custom container image instead of the default image. Typically, a custom container image adds Spark workload Java or Python dependencies not provided by the default container image. Important: Do not include Spark in your custom container image; Dataproc Serverless for Spark will mount Spark into the container at runtime.
When you submit your Spark workload, Dataproc Serverless for Spark can dynamically scale workload resources, such as the number of executors, to run your workload efficiently. Dataproc Serverless autoscaling is the default behavior, and uses Spark dynamic resource allocation to determine whether, how, and when to scale your workload.
Google provides this collection of pre-implemented Dataproc templates as a reference and as a starting point for developers who want to customize and extend their functionality.
This repository contains Serverless Spark on GCP solution accelerators built around common use cases - helping data engineers and data scientists with Apache Spark experience ramp up faster on Serverless Spark on GCP.
This tutorial shows how to build a reusable pipeline that reads data from Cloud Storage, performs data quality checks, and writes to Cloud Storage.
Reusable pipelines have a regular pipeline structure, but you can change the configuration of each pipeline node based on configurations provided by an HTTP server. For example, a static pipeline might read data from Cloud Storage, apply transformations, and write to a BigQuery output table. If instead you want the transformation and BigQuery output table to change based on the Cloud Storage file that the pipeline reads, you create a reusable pipeline.
When Dataflow launches worker VMs, it uses Docker container images to launch containerized SDK processes on the workers. You can specify a custom container image instead of using one of the default Apache Beam images. When you specify a custom container image, Dataflow launches workers that pull the specified image. Common reasons to use a custom container include preinstalling your pipeline's dependencies and customizing the worker execution environment.
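A hedged sketch of pointing a Beam pipeline at a custom worker image via the sdk_container_image option; the project, bucket, and image URI are examples:

```python
# Launch a Dataflow job whose workers pull a custom SDK container image.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
    sdk_container_image="us-docker.pkg.dev/my-project/repo/beam-worker:latest",
)

with beam.Pipeline(options=options) as p:
    (p | "Create" >> beam.Create(["hello", "custom", "container"])
       | "Print" >> beam.Map(print))
```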
Serverless VPC Access makes it possible for you to connect directly to your Virtual Private Cloud network from serverless environments such as Cloud Run, App Engine, or Cloud Functions. Configuring Serverless VPC Access allows your serverless environment to send requests to your VPC network using internal DNS and internal IP addresses (as defined by RFC 1918 and RFC 6598). The responses to these requests also use your internal network.
For your application to submit traces to Cloud Trace, it must be instrumented. You can instrument your code by using the Google client libraries. However, it's recommended that you use OpenTelemetry or OpenCensus to instrument your application. These are open source tracing packages. OpenTelemetry is actively in development and is the preferred package.
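A hedged instrumentation sketch with OpenTelemetry, assuming the opentelemetry-exporter-gcp-trace package is installed and credentials are configured:

```python
# Instrument an application with OpenTelemetry and export spans to Cloud Trace.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(CloudTraceSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("do-something"):
    pass  # application work to be traced
```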
Each time series includes the metric kind (type MetricKind) for its data points. The kind of metric data tells you how to interpret the values relative to each other. Cloud Monitoring metrics are one of three kinds:
A gauge metric, in which the value measures a specific instant in time. For example, metrics measuring CPU utilization are gauge metrics; each point records the CPU utilization at the time of measurement. Another example of a gauge metric is the current temperature.
A delta metric, in which the value measures the change since it was last recorded. For example, metrics measuring request counts are delta metrics; each value records how many requests were received since the last data point was recorded.
A cumulative metric, in which the value constantly increases over time. For example, a metric for “sent bytes” might be cumulative; each value records the total number of bytes sent by a service at that time.
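A hedged sketch of writing a single data point of a custom gauge metric with the Cloud Monitoring client; the metric type and project are examples:

```python
# Write one gauge data point to a custom metric.
import time
from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project_name = "projects/my-project"

series = monitoring_v3.TimeSeries()
series.metric.type = "custom.googleapis.com/queue_depth"
series.resource.type = "global"

now = time.time()
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": int(now), "nanos": int((now % 1) * 1e9)}}
)
point = monitoring_v3.Point({"interval": interval, "value": {"double_value": 42.0}})
series.points = [point]

client.create_time_series(name=project_name, time_series=[series])
```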
Log-based metrics derive metric data from the content of log entries. For example, you can use a log-based metric to count the number of log entries that contain a particular message or to extract latency information recorded in log entries. You can use log-based metrics in Cloud Monitoring charts and alerting policies.
# list the project ID
gcloud config list project
[core]
project = qwiklabs-gcp-44776a13dea667a6
# set your Project ID
gcloud config set project <YOUR_PROJECT_ID>
# set it as an environment variable
export PROJECT_ID=$(gcloud config get-value project)
20211006 Service orchestration on Google Cloud, EN
Service Choreography - With service choreography, each service works independently and interacts with other services in a loosely coupled way through events. Because services are loosely coupled, they can be changed and scaled independently, which means there is no single point of failure. However, with so many events flying around between services, the system becomes hard to monitor. Business logic is distributed and spans multiple services, so there is no single, central place to go for troubleshooting and no central source of truth for understanding the system. Understanding, updating, and troubleshooting are all distributed.
Service Orchestration - To handle the monitoring challenges of choreography, developers need to bring structure to the flow of events, while retaining the loosely coupled nature of event-driven services. Using service orchestration, the services interact with each other via a central orchestrator that controls all interactions between the services. This orchestrator provides a high-level view of the business processes to track execution and troubleshoot issues. In Google Cloud, Workflows handles service orchestration.
Cloud Workflows: Supports service orchestration. Workflows is a fully managed, serverless service to orchestrate and automate Google Cloud and HTTP-based API services with serverless workflows. Workflows is particularly helpful with Google Cloud services that perform long-running operations, because Workflows waits for them to complete, even if they take hours. With callbacks, Workflows can wait for external events for days or months. You can use either YAML or JSON to express your workflow.
Cloud Pub/Sub: Supports service choreography. Pub/Sub enables services to communicate asynchronously, with latencies on the order of 100 milliseconds. Pub/Sub is used as messaging-oriented middleware for service integration or as a queue to parallelize tasks.
Cloud Eventarc: Supports service choreography. Eventarc enables you to build event-driven architectures without having to implement, customize, or maintain the underlying infrastructure. Any service with Audit Log integration, or any application that can send a message to a Pub/Sub topic, can be an event source for Eventarc.
Cloud Tasks: Supports service choreography. Cloud Tasks lets you separate out pieces of work that can be performed independently, outside of your main application flow, and send them off to be processed asynchronously using handlers that you create. Difference from Pub/Sub: Pub/Sub supports implicit invocation, where a publisher implicitly causes the subscribers to execute by publishing an event, whereas Cloud Tasks is aimed at explicit invocation, where the publisher retains full control of execution, including specifying an endpoint where each message is to be delivered. Unlike Pub/Sub, Cloud Tasks provides tools for queue and task management, including scheduling specific delivery times, rate controls, retries, and deduplication.
Cloud Scheduler: Supports service choreography. With Cloud Scheduler, you set up scheduled units of work to be executed at defined times or regular intervals, commonly known as cron jobs. Cloud Scheduler can trigger a workflow (orchestration) or generate a Pub/Sub message (choreography). Cloud Scheduler uses cron scheduling to trigger the execution of HTTP-based services at a schedule you define.
20210422 Choosing the right orchestrator in Google Cloud, EN
20210116 Eventarc: A unified eventing experience in Google Cloud | Google Cloud Blog
Eventarc provides an easier path to receive events not only from Pub/Sub topics but from a number of Google Cloud sources with its 'Audit Log' and Pub/Sub integration. Any service with Audit Log integration or any application that can send a message to a Pub/Sub topic can be event sources for Eventarc.
In Eventarc, different events from different sources are converted to 'CloudEvents' compliant events. CloudEvents is a specification for describing event data in a common way with the goal of consistency, accessibility and portability.
20201202 Better service orchestration with Workflows, EN
In Orchestration, a central service defines and controls the flow of communication between services. With centralization, it becomes easier to change and monitor the flow and apply consistent timeout and error policies.
In Choreography, each service registers for and emits events as they need. There’s usually a central event broker to pass messages around, but it does not define or direct the flow of communication. This allows services that are truly independent at the expense of less traceable and manageable flow and policies.
In the orchestration vs. choreography debate, there is no right answer. If you're implementing a well-defined process with a bounded context, something you can picture with a flow diagram, orchestration is often the right solution. If you're creating a distributed architecture across different domains, choreography can help those systems work together.
After Lambda: Exactly-once processing in Cloud Dataflow,
In this scenario, there are two publishers publishing messages on a single topic. There are two subscriptions to the topic.
The first subscription has two subscribers, meaning messages will be load-balanced across them, with each subscriber receiving a subset of the messages.
The second subscription has one subscriber that will receive all of the messages.
The bold letters represent messages. Message A comes from Publisher 1 and is sent to Subscriber 2 via Subscription 1, and to Subscriber 3 via Subscription 2. Message B comes from Publisher 2 and is sent to Subscriber 1 via Subscription 1 and to Subscriber 3 via Subscription 2.
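A minimal publish/pull sketch loosely mirroring the fan-out described above; the project, topic, and subscription names are examples:

```python
# Publish to a topic, then pull and acknowledge from one of its subscriptions.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "my-topic")
print(publisher.publish(topic_path, b"Message A", origin="publisher-1").result())

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "subscription-1")
response = subscriber.pull(request={"subscription": subscription_path, "max_messages": 10})
for received in response.received_messages:
    print(received.message.data)
subscriber.acknowledge(
    request={
        "subscription": subscription_path,
        "ack_ids": [m.ack_id for m in response.received_messages],
    }
)
```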
Pub/Sub offers a broader range of features, per-message parallelism, global routing, and automatically scaling resource capacity.
Pub/Sub Lite can be as much as an order of magnitude less expensive, but offers lower availability and durability. In addition, Pub/Sub Lite requires you to manually reserve and manage resource capacity.
Retains unacknowledged messages in persistent storage for 7 days from the moment of publication. There is no limit on the number of retained messages. If subscribers don't use a subscription, the subscription expires. The default expiration period is 31 days.
| Resource | Limits |
| --- | --- |
| Schema | Schema size (the definition field): 10 KB |
| Publish request | 10 MB (total size); 1,000 messages |
| Message | Message size (the data field): 10 MB; attributes per message: 100; attribute key size: 256 bytes; attribute value size: 1,024 bytes |
| Push outstanding messages | 3,000 * N by default. 30,000 * N for subscriptions that acknowledge >99% of messages and average <1 s of push request latency. N is the number of publish regions. For more information, see Using push subscriptions. |
| StreamingPull streams | 10 MB/s per open stream |
| Pull/StreamingPull messages | The service might impose limits on the total number of outstanding StreamingPull messages per connection. If you run into such limits, increase the rate at which you acknowledge messages and the number of connections you use. |
Quota mismatches
Quota mismatches can happen when published or received messages are smaller than 1000 bytes. For example:
If you publish 10 messages of 500 bytes each in separate requests, your publisher quota usage will be 10,000 bytes. This is because messages that are smaller than 1,000 bytes are automatically rounded up to the next 1,000-byte increment.
If you receive those 10 messages in a single pull response, your subscriber quota usage might be only 5 kB, since the actual size of each message is combined to determine the overall quota.
The inverse is also true. The subscriber quota usage might be greater than the publisher quota usage if you publish multiple messages in a single publish request or receive the messages in separate Pull requests.
| | Cron jobs | Tasks |
| --- | --- | --- |
| Triggering | Triggers actions at regular fixed intervals. You set up the interval when you create the cron job, and the rate does not change for the life of the job. | Triggers actions based on how the individual task object is configured. If the scheduleTime field is set, the action is triggered at that time. If the field is not set, the queue processes its tasks in a non-fixed order. |
| Setting rates | Initiates actions on a fixed periodic schedule. Once a minute is the most fine-grained interval supported. | Initiates actions based on the amount of traffic coming through the queue. You can set a maximum rate when you create the queue, for throttling or traffic smoothing purposes, up to 500 dispatches per second. |
| Naming | Except for the time of execution, each run of a cron job is exactly the same as every other run of that cron job. | Each task has a unique name, and can be identified and managed individually in the queue. |
| Handling failure | If the execution of a cron job fails, the failure is logged. If retry behavior is not specifically configured, the job is not rerun until the next scheduled interval. | If the execution of a task fails, the task is retried until it succeeds. You can limit retries based on the number of attempts and/or the age of the task, and you can control the interval between attempts in the configuration of the queue. |
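A hedged sketch of enqueueing an individually named HTTP task with the Cloud Tasks client; the project, location, queue, and handler URL are examples:

```python
# Create a named task with an HTTP target in a Cloud Tasks queue.
from google.cloud import tasks_v2

client = tasks_v2.CloudTasksClient()
parent = client.queue_path("my-project", "us-central1", "my-queue")

task = {
    "name": client.task_path("my-project", "us-central1", "my-queue", "invoice-12345"),
    "http_request": {
        "http_method": tasks_v2.HttpMethod.POST,
        "url": "https://example.com/handlers/invoice",
        "body": b'{"invoice_id": 12345}',
    },
}
response = client.create_task(request={"parent": parent, "task": task})
print(response.name)
```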
Workflows orchestrates multiple HTTP-based services into a durable and stateful workflow. It has low latency and can handle a high number of executions. It's also completely serverless.
Workflows is great for chaining microservices together, automating infrastructure tasks like starting or stopping a VM, and integrating with external systems. Workflows connectors also support simple sequences of operations in Google Cloud services such as Cloud Storage and BigQuery.
Cloud Composer is designed to orchestrate data driven workflows (particularly ETL/ELT). It's built on the Apache Airflow project, but Cloud Composer is fully managed. Cloud Composer supports your pipelines wherever they are, including on-premises or across multiple cloud platforms. All logic in Cloud Composer, including tasks and scheduling, is expressed in Python as Directed Acyclic Graph (DAG) definition files.
Cloud Composer is best for batch workloads that can handle a few seconds of latency between task executions. You can use Cloud Composer to orchestrate services in your data pipelines, such as triggering a job in BigQuery or starting a Dataflow pipeline. You can use pre-existing operators to communicate with various services, and there are over 150 operators for Google Cloud alone.
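A minimal sketch of what a Composer DAG definition file looks like, assuming Airflow 2-style imports; the DAG ID, schedule, and tasks are illustrative:

```python
# A Cloud Composer (Airflow) DAG: tasks and scheduling expressed in Python.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator

def transform():
    print("transforming data")

with DAG(
    dag_id="daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extracting")
    load = PythonOperator(task_id="transform_and_load", python_callable=transform)
    extract >> load   # declare the dependency; Airflow builds the DAG from it
```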
Detailed feature comparison
| Feature | Workflows | Cloud Composer |
| --- | --- | --- |
| Syntax | Workflows syntax in YAML or JSON format | Python |
| State model | Imperative flow control | Declarative DAG with automatic dependency resolution |
GSP246: Predict Taxi Fare with a BigQuery ML Forecasting Model
Linear Regression with Pyspark in 10 steps.
20221205 MLOps at Walgreens Boots Alliance With Databricks Lakehouse Platform - The Databricks Blog, EN