AWS Support - Q&A
1. I see that most of your connections and queries are running on the master (writer) database instance itself, without much load on the reader instance. You can consider dividing your workload such that read-only queries/workloads are directed to the reader instance and only write queries are handled by your writer instance. This will help alleviate the large undo logs (RollbackSegmentHistoryListLength) caused by long-running queries, and in itself this should mitigate a lot of the performance issues.
One way to achieve splitting of reads and writes is by using a third-party proxy solution which can route reads and writes to the appropriate endpoints. Below are a few example software solutions which you can consider:
[+] ProxySQL - https://proxysql.com/
[+] Heimdall Data - https://www.heimdalldata.com/
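Whichever routing approach you use, it helps to confirm from within a session whether it landed on the writer or a reader. A minimal check, assuming Aurora MySQL (where the aurora_server_id and innodb_read_only variables are available):
-- innodb_read_only is 1 on Aurora readers and 0 on the writer;
-- aurora_server_id identifies the instance the session is connected to.
SELECT @@aurora_server_id AS instance_id,
       @@innodb_read_only AS is_reader;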
2. If and where possible, try to split large transactions into multiple smaller transactions. This will again reduce the growth of the undo log which seems to be the main cause of the slowness that you are facing.
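As a rough sketch of what splitting a large transaction can look like (the table name events, the date, and the batch size below are placeholders), committing in small batches keeps the undo history short:
-- Repeat until ROW_COUNT() returns 0.
START TRANSACTION;
DELETE FROM events
 WHERE created_at < '2022-01-01'
 LIMIT 10000;
COMMIT;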
3. I can see that you have set aurora_parallel_query to off in the parameter group attached to your instances. The Aurora parallel query feature can improve speeds for large select queries, and you can consider testing and enabling the same for your read queries on your reader instance. If you see any improvement for your long running read queries in your test environment using aurora parallel query, you can then enable this feature on your production database instances as well.
Note - Aurora parallel query can only be enabled via the parameter group from Aurora MySQL version 2.09 onwards. For Aurora versions prior to 2.09, parallel query cannot be enabled dynamically and you will have to select this feature while creating (or restoring) the cluster.
[+] Working with parallel query for Amazon Aurora MySQL - https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-mysql-parallel-query.html
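If you would like to try this at the session level first, a sketch (assuming Aurora MySQL 2.09+, where aurora_parallel_query is exposed as a session variable; the query below is a placeholder):
SELECT @@aurora_parallel_query;           -- confirm the current setting
SET SESSION aurora_parallel_query = ON;   -- enable for this session only, for testing
EXPLAIN SELECT COUNT(*) FROM orders WHERE order_total > 100;
-- "Using parallel query" in the Extra column indicates the optimizer will use it.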
Further, the MySQL documentation below provides additional pointers on how you can optimize transaction management and select queries when using InnoDB.
[+] Optimizing InnoDB Transaction Management - https://dev.mysql.com/doc/refman/5.7/en/optimizing-innodb-transaction-management.html
4. If you are expecting this level of load on the writer instance, you should scale the instance up to serve the business. You can learn about the available instance classes and their associated hardware specs from the link below:
[+] https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.DBInstanceClass.html#Concepts.DBInstanceClass.Summary
To scale the instance, you can follow the link below:
[+] https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.DBInstance.Modifying.html
5. Enable Performance Insights [2]. With the Performance Insights dashboard, you can visualize the database load and filter it by waits, SQL statements, hosts, or users, which will help you identify opportunities to improve your queries.
6. In a properly tuned application, the History List Length (HLL) stays around 5,000 or below, which is the normal and acceptable range. Your current value is far above this and has a lot of performance implications for the RDS instance (e.g., increased CPU usage).
You can confirm the findings regarding the Rollback Segment History by running this query:
select name, count from information_schema.INNODB_METRICS where name like '%hist%';
The attached charts also show the Rollback Segment History.
Another reason for the increase in history length is hung transactions. Hung transactions are detected when the InnoDB transaction history starts growing. When you check the MySQL process list, those transactions show up in the "Sleep" state. Sleeping transactions are discussed in detail later in this correspondence. It turns out that those transactions were "lost" or "hung". As we can also see, each of those transactions holds two lock structures and one undo record, so they are neither committed nor rolled back. They are sitting there doing nothing. In this case, with the default isolation level REPEATABLE-READ, InnoDB can't purge the undo records (transaction history) for other transactions until these "hung" transactions are finished.
In a perfect world, an application opens a session, executes a transaction, commits it, and closes the session. However, many applications have been found to "forget" to clean up these transactions, keeping the sessions and the underlying transactions open for hours, days, and in the most extreme cases, weeks at a time.
As a result, subsequent transactions executing against these tables must not only read the data from the table, but also read undo pages from every transaction that has been performed on the table since the original uncommitted transaction was opened. These undo logs also require storage space to hold the transaction records for data integrity.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Finding the queries from the hung transactions and long transactions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are a number of options to find the queries from that “hung” transaction. In older MySQL versions, the only way is to enable the general log (or the slow query log). Starting with MySQL 5.6, we can use the Performance Schema. Here are the steps:
1. Enable performance_schema if it is not already enabled (it is disabled on RDS / Aurora by default).
2. Enable events_statements_history:
MySQL
mysql> update performance_schema.setup_consumers set ENABLED = 'YES' where NAME='events_statements_history';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
3. Run the following query to find all transactions that started more than 10 seconds ago (change the number of seconds to match your workload):
SELECT ps.id as processlist_id,
trx_started, trx_isolation_level,
esh.EVENT_ID,
esh.TIMER_WAIT,
esh.event_name as EVENT_NAME,
esh.sql_text as SQL_TEXT,
esh.RETURNED_SQLSTATE, esh.MYSQL_ERRNO, esh.MESSAGE_TEXT, esh.ERRORS, esh.WARNINGS
FROM information_schema.innodb_trx trx
JOIN information_schema.processlist ps ON trx.trx_mysql_thread_id = ps.id
LEFT JOIN performance_schema.threads th ON th.processlist_id = trx.trx_mysql_thread_id
LEFT JOIN performance_schema.events_statements_history esh ON esh.thread_id = th.thread_id
WHERE trx.trx_started < CURRENT_TIME - INTERVAL 10 SECOND
AND ps.USER != 'SYSTEM_USER'
ORDER BY esh.EVENT_ID;
Now we can see the list of queries from the old transaction (the MySQL query used was taken with modifications from this blog post: Tracking MySQL query history in long running transactions). [1][1a]
At this point, we can chase this issue at the application level and find out why this transaction was not committed. The typical causes:
- There is a heavy, non-database-related process inside the application code. For example, the application starts a transaction to get a list of images for analysis and then starts an external application to process those images (machine learning or similar), which can take a very long time.
- The application got an uncaught exception and exited, but the connection to MySQL was not closed for some reason (i.e., returned to the connection pool).
We can also try to configure the timeouts on MySQL or the application so that the connections are closed after “N” minutes.
~~~~~~~~~~~~~~~~~~~~~~~~~
Sleeping Task
~~~~~~~~~~~~~~~~~~~~~~~~~
Sleeping tasks can still consume resources [2].
You can also check the number of tasks that aren't in use (sleeping tasks). These tasks can lead to increased resource consumption (RAM, cache, and CPU), which can slow down the server, and this contributes to the CPU usage you are seeing on the RDS instance. It's a best practice to tune your application to gracefully close connections that aren't in use. You can also modify the values of the wait_timeout and interactive_timeout parameters to close connections based on the values you set. For more information, see the MySQL documentation for wait_timeout and interactive_timeout [3][4].
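As a quick check of the current values (a sketch; on RDS/Aurora these parameters are typically changed through the DB parameter group rather than SET GLOBAL):
-- Values are in seconds; the engine default for both is 28800 (8 hours).
SHOW VARIABLES LIKE 'wait_timeout';
SHOW VARIABLES LIKE 'interactive_timeout';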
In your process list, you can see sleeping tasks with the state "Cleaned up". In Aurora, the threading model has been re-implemented compared with conventional MySQL: Aurora can free up the worker thread tied to a connection while the response is being sent to the client, so that it can start doing other work.
The messages "Delayed commit ok done" and "Delayed send ok done" are actually idle states, and the threads have returned after completing the operation. In short, either the client is keeping the connections open, or it is not closing the connections properly. Even if you run a simple query, it will show a similar message as long as the inserts are not waiting for locks. These are normal, expected Aurora thread states and are informational in nature.
There's an additional "Cleaned up" [5] message which you might encounter as well from the processlist. It is the final state of a connection whose work is complete but which has not been closed from the client side. In MySQL, this field is left blank (no State) in the same circumstance. I'll touch a little more on how these three states differ[6]:
1. Delayed send ok done - It is an idle state after a WRITE operation, but it has not been COMMIT-ed yet. It implies that the asynchronous ACK back to the client has been completed.
2. Delayed commit ok done - It is an idle state after COMMIT.
3. Cleaned up - It is an idle state after a READ operation has been completed, such as running a SELECT query. The thread has completed its work but the connection is still open.
These threads will be removed after a certain amount of time. I would suggest you close the connections explicitly. I can see that you are using the custom parameter group pes-stage-provisioned-db-rds. I checked and wait_timeout is not set (null) in this parameter group, so the engine default applies, which is too high. I would recommend tuning the value of wait_timeout [2]. I have also attached a reference to the MySQL thread states [6].
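To see which clients are holding idle connections open, a minimal sketch using information_schema.processlist:
-- List sleeping connections, longest-idle first.
SELECT id, user, host, db, command, time AS seconds_idle, state
  FROM information_schema.processlist
 WHERE command = 'Sleep'
 ORDER BY time DESC;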
The following are the symptoms when the HLL is high.
- Aurora Reader instance restarts due to "Fall Behind": When a very large/old transaction is completed, the purge thread will start working at a very high pace in the writer to clean up. Aurora readers might not be able to catch up to the added write throughput. (Especially, if a reader was the one holding the old read view - its own extra cleanup work might make it fall behind/lag too much).
- "Missing/Delayed" data in Aurora readers: A customer using an old read view on a reader to evaluate whether data recently modified on the writer is available on the reader will not see the data, therefore being led to believe that there is a problem of missing data or greatly delayed replication between the writer and that reader. The simple solution for this situation is to just close the client session and open a new one, the data will be there.
- Higher CPU usage: Because transactions might have to, for each row they read from a table, read a chain of undo records, and these are likely in memory, additional CPU will be required to access the several versions of each row, instead of accessing only the current one. One has noticed the high CPU usage when the HLL is high on 165,124 at 2022-01-05T17:52 UTC.
- Slower SELECT statements (higher transactional latency): The older a transaction is, and the more other transactions have modified data that the old transaction wants to access, the larger the undo chains will be... slowing down even more the older transaction.
- Inconsistent SELECT statement timing: Please note, there will be no change or difference on the execution plan of the SQL statement. This can lead to the same SELECT performing very well when the HLL is low, and performing much slower when the HLL is high. Same SQL, same execution plan, different times. Therefore, to ensure a true benchmark, please check that the HLL is low when taking timing.
- Higher memory usage: For the same reasons above, a session traversing all this additional data will require more memory as well.
- Crashes and Failovers: In extreme cases, a high HLL can lead to crashes and/or failovers.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Aurora specific troubleshooting regarding HLL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For HLL/MVCC issues, an Aurora cluster operates as a single entity. While InnoDB is write-enabled only on the writer instance, each reader can open transactions (read views), and a transaction open on a reader will block the writer's purge process, therefore increasing the HLL. You are hence correct to note that the Aurora cluster uses the same storage for both the reader and writer nodes.
So, one needs to be careful about open transactions / long-running statements on all instances (e.g. including the readers) [7].
To make this easier, instead of connecting to each reader and checking transactions individually, you can use this select on the writer to identify whether one of the readers is holding an old read view (as compared to the InnoDB engine status on the writer):
SELECT server_id,
       IF(session_id = 'master_session_id', 'writer', 'reader') AS role,
       replica_lag_in_msec,
       oldest_read_view_trx_id,
       oldest_read_view_lsn
  FROM mysql.ro_replica_status;
Lastly, high HLL, if caused by a reader, can cause additional memory usage on that reader, and when the reader closes the read view, can cause that reader to fall behind and be restarted.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Preventing HLL (History List Length)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The on-site DBA should always be monitoring their databases with regards to HLL - in MySQL, through the SHOW ENGINE INNODB STATUS output, or through this select:
select NAME, COUNT from INFORMATION_SCHEMA.INNODB_METRICS where NAME = 'trx_rseg_history_len';
If the HLL grows, they can take the same steps documented in MySQL to close any old MVCC snapshot (AKA read view). An efficient way to do that is to ensure both that there are no long-running (hours-long) selects and that there are no long-open transactions (not even idle ones).
This select can be run on each instance:
SELECT a.trx_id, a.trx_state, a.trx_started,
TIMESTAMPDIFF(SECOND,a.trx_started, now()) as "Seconds Transaction Has Been Open",
a.trx_rows_modified, b.USER, b.host, b.db, b.command, b.time, b.state
from information_schema.innodb_trx a, information_schema.processlist b
where a.trx_mysql_thread_id=b.id
order by trx_started;
Once the session/transaction holding the oldest TRX_ID has been identified, the on-site DBA needs to evaluate whether, or when, to kill that session to unblock the purge operation.
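On RDS/Aurora, sessions are usually terminated with the rds_kill stored procedures rather than the plain KILL statement. A sketch, where 12345 stands in for the processlist id found above:
CALL mysql.rds_kill(12345);          -- terminate the connection (example thread id)
-- or, to end only the running statement while keeping the connection open:
-- CALL mysql.rds_kill_query(12345);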
I hope above information helps. However, if you have any questions or concerns related to this issue please let me know, as here at AWS we are always happy to assist you the best possible way.
A gentle note that my Sydney (Australia) based work shift is from Tuesday to Saturday 7 AM till 3PM AEST.
If you need urgent assistance, please initiate a call or chat from the AWS Support console and an AWS Support engineer will be able to assist you immediately.
Thank you, please always be in good health and have a blessed day ahead.
[1] https://gist.github.com/djpentz/2482b5bf4b9682a7bb455dce5f1f79f8
[1a] https://www.percona.com/blog/2017/05/08/chasing-a-hung-transaction-in-mysql-innodb-history-length-strikes-back/
[2] https://aws.amazon.com/premiumsupport/knowledge-center/rds-instance-high-cpu/
[3] https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_wait_timeout
[4] https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_interactive_timeout
[5] https://forums.aws.amazon.com/message.jspa?messageID=707334
[6] https://dev.mysql.com/doc/refman/5.7/en/general-thread-states.html
[7] https://www.percona.com/blog/2017/05/08/chasing-a-hung-transaction-in-mysql-innodb-history-length-strikes-back/
Regarding why the write operation had poor throughput: it depends on the amount of data inserted by the operation, along with any deadlocks or wait events for locks it may need to acquire. To that end, if possible you could run those write operations and check for any deadlocks they may be causing (by checking the output of SHOW ENGINE INNODB STATUS). You can also run a profile on them to see which stage of execution is slowing down the whole query. The explain plan will also provide insights into the query execution plan, allowing you to make changes to improve efficiency.
https://dev.mysql.com/doc/refman/5.7/en/show-profile.html
https://dev.mysql.com/doc/refman/5.7/en/show-engine.html
https://dev.mysql.com/doc/refman/5.7/en/explain.html
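As a sketch of how these can be combined for one of the slow write statements (the table names and statements below are placeholders only):
SET profiling = 1;
INSERT INTO orders_archive SELECT * FROM orders WHERE order_date < '2022-01-01';
SHOW PROFILES;                  -- lists the profiled statements with their query ids
SHOW PROFILE FOR QUERY 1;       -- per-stage timing for the statement above
EXPLAIN SELECT * FROM orders WHERE order_date < '2022-01-01';   -- execution plan of the read side
SHOW ENGINE INNODB STATUS\G     -- check the LATEST DETECTED DEADLOCK section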
>>Do we have to look at using managed Aurora cluster with read replicas, connection pooling for better throughput?
It is indeed an alternative you can consider. Your workload contains both read and write traffic, so with a provisioned Aurora cluster you will have a writer and readers to split the write and read workloads. This, along with connection pooling, can indeed increase the overall throughput for your database. You can try restoring a snapshot of your cluster to a provisioned cluster and test it against your workload. Further, provisioned clusters come with Enhanced Monitoring and Performance Insights, which give more insight into resource utilisation and query performance.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PerfInsights.html
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_Monitoring.OS.html
a) What are the factors used for determining scaling in terms of CPU and connections?
Yes, Aurora scaling depends on CPUUtilization and the number of application connections. Aurora Serverless v1 scales up when capacity constraints are seen in CPU or connections, and automatically scales up when it detects performance issues that can be resolved by scaling up.
The capacity allocated to your Aurora Serverless v1 DB cluster seamlessly scales up and down based on the load generated by your client application. Here, load is CPU utilization and number of connections. Aurora Serverless v1 can also scale to zero capacity when there are no connections if you enable the pause-and-resume option for your DB cluster's capacity settings.
For more information on how Aurora Serverless v1 works, please refer to the following link.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless.how-it-works.html
b) What is the max connections per ACU?
The maximum number of simultaneous database connections varies by the memory allocated to the DB instance class. For MariaDB and MySQL, the formula {DBInstanceClassMemory/12582880} is used to calculate the default number of connections.
Yes, the connection values which you have mentioned are correct.
For more information, please refer to the following link.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Limits.html#RDS_Limits.MaxConnections
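The formula above uses the instance memory in bytes; for example, on a class with 16 GiB of memory (db.r5.large, used here only as an illustration) that works out to roughly 16 * 1024^3 / 12582880 ≈ 1365 connections before any parameter group override. You can confirm the effective value on the instance itself:
SELECT @@max_connections;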
c) Inferring write throughput metric. It shows value as ~1000 so does it mean writing 1000 records per sec?
Could you please let me know whether you are referring to a document that mentions a WriteThroughput of ~1000?
In general, WriteThroughput is the average number of bytes written to disk per second; it is measured in bytes per second, not records per second.
ACUs   Memory (in GB)   Max Connections
1      2                90
2      4                180 (up from 90)
4      8                270 (up from 135)
8      16               1,000
16     32               2,000
32     64               3,000
64     122              4,000
128    244              5,000
256    488              6,000
Multi-Tenancy
==================
From your case description, I understand that you have created two app clients in the user pool, which are integrated with two different load balancers. When a user logs in to one application, they are also able to log in to the other application without being asked to log in again (an SSO experience).
In regard to your case, the implementation in question is called "multi-tenancy support". A Cognito user pool represents a single tenant: users in a user pool belong to the same directory and share the same settings, such as password policy, custom attributes, MFA settings, advanced security settings, etc.
In your case, the "same user pool, multiple app clients" approach is used. Here, a single user pool hosts all users, and each app client represents a tenant. This is easier to maintain, but tenants share the same settings. This approach requires additional considerations when the hosted UI is used to authenticate users with native accounts, e.g. username and password. When the hosted UI is in use, a session cookie is created on the Cognito side to maintain the session for the authenticated user, and it provides an SSO experience between app clients in the same user pool. If SSO is not the desired behavior in your application, the hosted UI shouldn't be used with this approach to authenticate native accounts.
The cons of the "same user pool, multiple app clients" approach are:
- It requires you to perform tenant-match logic on the client side through a custom UI, to determine which app client to authenticate users against.
- It also requires additional auth logic to verify that a user belongs to a given tenant (since all users share one pool, it is technically possible for users to authenticate against any app client).
The possible workaround in this case at the moment would be to use different user pools for the purpose. Later, you can move ahead with the approach of using a custom UI to implement the tenant-match logic.
Too Many Redirects
=====================
This is a known issue with the Application Load Balancer (ALB): in its first response, the ALB sent 2 cookie fragments, say *-0 and *-1, and the client's browser stored them. In the subsequent request, the client sends both fragments, but this time the cookie size is smaller, so the ALB creates only 1 fragment, say *-0, and sends it in the response.
In the next request, the client's browser has only updated the value of the *-0 cookie; the value of *-1 is stale, and it sends both fragments instead of just the latest one.
The ALB then throws a decrypt error, as it cannot decrypt that cookie.
I checked with the internal team and understand that this is a known issue which they are currently working on fixing.
As a workaround, there is not much we can do for now other than clearing the cookies every time.
We want to understand how ALB traffic routing takes place in EKS context.
Assume that we have a 3 node Multi-AZ EKS cluster in us-east-1-region.
Node 1 - us-east-1a
Node 2 - us-east-1b
Node 3 - us-east-1c
We have created an ALB in instance mode for a Kubernetes service, which means that the ALB has targets to the 3 instance nodes rather than the pods themselves.
Case 1:
We have 3 pods mapped to the Kubernetes service and each node has one of the pods running.
When a request is sent to the ALB from the us-east-1a AZ, does it always forward the traffic to the node in the same AZ as the load balancer?
Case 2:
We have only 1 pod mapped to the Kubernetes service and that pod is running in the us-east-1b node.
When a request is sent to the ALB from the us-east-1a AZ, does it send the traffic to the us-east-1b node, or does it send it to the us-east-1a node and Kubernetes then forwards the traffic to the us-east-1b node as pod-to-pod communication traffic?
Answer:
=======================
The default setting for externalTrafficPolicy is “Cluster,” which allows every worker node in the cluster to accept traffic for every service no matter if a pod for that service is running on the node or not. Traffic is then forwarded on to a node running the service via kube-proxy.
This is typically fine for smaller or single AZ clusters but when you start to scale your instances it will mean more instances will be backends for a service and the traffic is more likely to have an additional hop before it arrives at the instance running the container it wants.
When running services that span multiple AZs, you should consider setting the externalTrafficPolicy in your service to help reduce cross AZ traffic.
By setting externalTrafficPolicy to Local, instances that are running the service container will be load balancer backends, which will reduce the number of endpoints on the load balancer and the number of hops the traffic will need to take.
Another benefit of using the Local policy is you can preserve the source IP from the request. As the packets route through the load balancer to your instance and ultimately your service, the IP from the originating request can be preserved without an additional kube-proxy hop.
An example service object with externalTrafficPolicy set would look like this:
apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  selector:
    app: example
  ports:
    - port: 8765
      targetPort: 9376
  externalTrafficPolicy: Local
  type: LoadBalancer
The red index must be deleted in order to bring the cluster status back to green.
The process to restore the correct index from a snapshot is shown below:
Identifying the red indices:
* GET _cat/indices?health=red
Run the following API call to know the snapshot repository name:
* GET /_snapshot?pretty
Once the snapshot repository name is identified (in most cases it is 'cs-automated' or 'cs-automated-enc'), please run the following API call to list the snapshots.
* GET /_snapshot/repository/_all?pretty (Replace the 'repository' with your repository name)
Deleting the red index:
* DELETE /index-name. (Replace the 'index-name' with the index that you need to delete.)
Once you have identified the snapshot from which you want to restore the deleted index, you can run the following API call to restore the index.
* POST /_snapshot/cs-automated/snapshot-name/_restore { "indices": "index-name" }
(Replace 'snapshot-name' with the snapshot name that you identified, and 'index-name' with the name of the index that you want to restore.)
For more info on Restoring Snapshots, See 'Restoring Snapshots' link below[1].
If you have any further questions, feel free to reach out to me and I will be happy to assist.
References:
[1] - 'Restoring Snapshots' https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-managedomains-snapshots.html#es-managedomains-snapshot-restore
Attempt to heartbeat failed since group is rebalancing
Revoke previously assigned partitions
(Re-)joining group
Sending LeaveGroup request to coordinator (rack: null) due to consumer poll timeout has expired. This means the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time processing messages. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
Answer
===========
The way Kafka messages are consumed is that a consumer.poll() call fetches a batch of records from the Kafka topic; the consumer application then processes those records in a loop and makes the next consumer.poll() call to fetch the next batch. The maximum permitted time between poll calls is defined by the "max.poll.interval.ms" Kafka consumer configuration parameter (which defaults to 300 seconds, i.e. 5 minutes, unless explicitly overridden). If the time between 2 consumer.poll() calls goes over this mark, the consumer instance leaves the consumer group, forcing the group coordinator to trigger a rebalance and redistribute the Kafka topic's partitions across the other available consumer instances. This is an indication of slow processing logic, or that the Kafka records are being sent to a downstream application which is slow to respond, in turn increasing the overall time taken in the processing logic. In such cases, as the error message suggests, it is advisable to:
1. Increase the value of "max.poll.interval.ms" to a higher value. This would help in accommodating sudden increases in record processing time and ensures that the consumer group does not enter into a rebalancing state.
2. Decrease the total number of records returned by Kafka in each poll cycle by tuning the "max.poll.records" (defaults to 500) consumer parameter. Note, however, that this might slow down overall consumption even when the processing logic is behaving normally and taking the usual time to process records.
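As a sketch, these two consumer properties could be adjusted along the following lines (the values are illustrative, not recommendations):
# Consumer configuration (illustrative values)
# Allow up to 10 minutes between poll() calls:
max.poll.interval.ms=600000
# Return fewer records per poll so each batch finishes sooner:
max.poll.records=200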
The issue has been identified as the restrictive bucket policy on the target bucket, named "xxxx". The 2 specific rules which are causing the deny are "DenyIncorrectEncryptionHeader" and "DenyUnEncryptedObjectUploads". I have added these rules to my own S3 bucket and immediately my outfile operations failed with "Error Code: 63994. S3 API returned error: Access Denied:Access Denied".
As the outfile generated by MySQL is not an encrypted object, the above policy rules are denying the operation. Furthermore, as there is no option to create the outfile as an encrypted object, there are 2 options which come to mind.
1. Remove the above mentioned rules from the bucket policy. This would obviously depend on your organization's own policies and procedures.
2. Create a new bucket without the above mentioned rules in its bucket policy.
When we try to resolve the public hosted zone record "" from within a pod running in EKS cluster "" residing in a private subnet it results in "getaddrinfo ENOTFOUND".
Answer
============
In your Route 53 configuration you have the same domain in the private and public hosted zones. This is called split-view DNS and is described in detail in the documentation link below.
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zone-private-considerations.html
The idea is if there's a private hosted zone name that matches the domain name in the request, the hosted zone is searched for a record that matches the domain name and DNS type in the request.
And if there's a matching private hosted zone but there's no record that matches the domain name and type in the request, Resolver doesn't forward the request to a public DNS resolver. Instead, it returns NXDOMAIN (non-existent domain) to the client.
This explains the behaviour you are getting, and only the records in the private hosted zones will resolve from the VPC attached to that private zone.
To overcome this, I would advise adding the records you need to the private hosted zone, just as they exist in the public zone.
We are trying to do cross account replication between two AWS accounts (Account A to Account B).
We have provided the required permissions to IAM role in source account and the replication permissions in the destination bucket policy.
But the replication is in failed status. I have enabled server access logging on the source bucket and can see that the replication is successfully getting the newly uploaded object.
Answer
==============
I found that "Owner Override to Destination" [1] is enabled on the replication rule, but the destination bucket policy doesn't include the statement that allows this; below is an example:
{
  "Sid": "1",
  "Effect": "Allow",
  "Principal": {"AWS": "source-bucket-account-id"},
  "Action": ["s3:ObjectOwnerOverrideToBucketOwner"],
  "Resource": "arn:aws:s3:::destination-bucket/*"
}
In addition, the IAM role doesn't have the "s3:ObjectOwnerOverrideToBucketOwner" permission for the destination bucket.
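For reference, the statement that would need to be added to the replication role's IAM policy would look roughly like the following (the destination bucket name is a placeholder):
{
  "Effect": "Allow",
  "Action": ["s3:ObjectOwnerOverrideToBucketOwner"],
  "Resource": "arn:aws:s3:::destination-bucket/*"
}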
Below are the answers to your questions:
1. Source bucket has Amazon S3 master-key (SSE-S3) encryption enabled. Will this be carried over to destination bucket?
Yes, for objects that are SSE-S3 encrypted in the source bucket, the replica in the destination bucket will also have SSE-S3 encryption enabled.
2. Is enabling AWS KMS key for encrypting destination objects arn:aws:kms:us-east-1::alias/aws/s3 in replication same as Amazon S3 master-key (SSE-S3) encryption?
No, the AWS KMS key [2] and SSE-S3 [3] are different types of encryption. An Amazon S3 key (SSE-S3) is an encryption key that S3 creates, manages, and uses for you, while an AWS KMS key (SSE-KMS) is an encryption key protected by the AWS Key Management Service. Please note that the IAM role doesn't have KMS permissions, so it is not allowed to use the KMS key arn:aws:kms:us-east-1::alias/aws/s3, and replication of KMS-encrypted objects will therefore fail.
3. Will destination account use source account SSE key after replication?
The replicated objects will have SSE-S3 encryption enabled, but this is not the same key as the source bucket's SSE-S3 key, since the buckets are in different AWS accounts.
For more information please refer to the links below:
[1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/replication-change-owner.html#repl-ownership-add-role-permission
[2] https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#aws-managed-cmk
[3] https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingServerSideEncryption.html