Hi Cecilia, about your questions: 1 - What I mean is that an endpoint is like an "API" that you can put behind an API Gateway to invoke your model. The expected behavior of an API is that you invoke it and get a result back (this is the output). So your Lambda could take the endpoint response and save it to S3, put it in a queue or topic, or even return it to the API Gateway (assuming that you have an HTTP API that invokes this endpoint). As for batch predictions, you will usually use them for asynchronous inference. For example, I have a project that predicts dropouts. After all the classes have ended, we trigger an asynchronous batch prediction job that gets all the students' data and calls the inference model. Since we have thousands of student profiles, the endpoint is called with batches of 1000 profiles each, and the results are stored in an S3 bucket. This action triggers the remaining pipeline steps, in which the new scores are saved to our database.
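To make the batching idea concrete, here is a minimal Python sketch of splitting the profiles into groups of 1000 and calling the model once per group. It is only an illustration, not the code from my project: `chunked`, `score_in_batches`, and `fake_invoke` are hypothetical names, and `invoke` is a placeholder for whatever actually calls your endpoint (for example, a small wrapper around boto3's `sagemaker-runtime` `invoke_endpoint`). Writing the results to S3 is left out.

```python
from typing import Callable, Iterator, List, Sequence

Profile = dict   # assumption: each student profile is a plain dict
Score = float    # assumption: the model returns one score per profile


def chunked(items: Sequence[Profile], size: int) -> Iterator[Sequence[Profile]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def score_in_batches(
    profiles: Sequence[Profile],
    invoke: Callable[[Sequence[Profile]], List[Score]],
    batch_size: int = 1000,
) -> List[Score]:
    """Call `invoke` once per batch and collect the scores in input order."""
    scores: List[Score] = []
    for batch in chunked(profiles, batch_size):
        batch_scores = invoke(batch)
        # Guard against a response that does not line up with the request.
        if len(batch_scores) != len(batch):
            raise ValueError("model returned a different number of scores than profiles")
        scores.extend(batch_scores)
    return scores


def fake_invoke(batch: Sequence[Profile]) -> List[Score]:
    """Stand-in for the real endpoint call; returns a constant score per profile."""
    return [0.5 for _ in batch]


if __name__ == "__main__":
    students = [{"student_id": i} for i in range(2500)]
    results = score_in_batches(students, fake_invoke)
    print(len(results))  # one score per student
```

Passing the endpoint call in as a function keeps the batching logic easy to test locally, and you can swap in the real call once the endpoint exists.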
For this project, this strategy is more cost-effective since w