This is a list of thought-provoking questions that Engineers should ask themselves while writing code and reviewing other people's code. This list was created by Senior+ Engineers in an effort to share their experience with others.
- People who are writing code but are not able to get code reviews from tenured engineers (i.e those not in an Engineering organization)
- Engineers who are looking to learn from what Senior+ engineers do and apply that to their work
- If you're working on a team already, ask them what they look for in Code Reviews so you can add to this list!
- All Engineers!
- Does it do what it's supposed to do?
- Is this code using industry-best coding standards, styles, and practices?
- Does it look like the code around it? If not, why not? Keep in mind that we shouldn't proliferate legacy code/coding styles.
- Is this PR easy to review? If not, what makes it difficult, and how can we address that?
- Can we (or should we) split it up into smaller PRs to focus the scope of reviews and changes?
- Are the right reviewers added to this PR?
- Does this PR do a refactor (or file moves) and feature work? Can we split the refactor out to a PR first so reviewing it is easier?
- Are there tests? Are they useful and correct?
- If you were to make a major change to this code, how would you know if you broke something?
- Do I know how to manually test this? Do others know how to manually test this based on the PR description?
- Does this code handle errors correctly?
- What kind of errors can happen in this code?
- Are the correct types used?
- If unsafe type assertions are being used (
as
,!
,any
) are they necessary, correct, and commented (if the reason for the assertion is unclear)?
- Do we need to update public or internal API documentation?
- If we're modifying an existing endpoint, what happens to clients that haven't updated yet ("Stale Clients")?
- If something that was previously optional is now required, how do we handle that? Do we have systems in place to refresh stale clients?
- If querying data, are we hitting the right index(es)? If not, consider changing the query or adding an index.
- If querying data, what's the theoretical result set size? Is it bounded? If not, consider adding a limit.
- If writing or deleting data, are we sure that the user has the correct permissions to do so?
- If using user-provided data in a query, have we validated it?
- What happens if something like
deleteUser(user.id)
is called anduser.id
is undefined? - Have we validated the data before writing it to the database?
- Do we have the proper timeouts, retries, and error handling?
- What is our rate limit for this service?
- How close are we to hitting the rate limit?
- What happens if we hit the rate limit?
- What happens if the service is down?
- If we're changing a queue event handler, think of events that are in the queue but haven't been processed yet. Do we need to handle old events differently?
- If introducing a new queue event handler, do we have the right metrics to monitor queue depth and latency?
- How big is the thing we're iterating over now? Is it a fixed size or dynamic based on something from the user?
- If we wanted to 10x the size of the thing in the future, how much work would be required to support it?
- Are we doing async operations in the iteration?
- Should the async operations be done concurrently or serially? Think
Promise.all()
vsfor await of
. - If concurrently, can the systems (network, database, CPU, etc) safely do N operations at one time? If not, consider batching or limiting concurrency.
- How do we handle errors? Does an error in one iteration stop the entire loop?
- Is CI passing?
- If it's failing, is it from changes made in this PR, or is it a known flaky test?
- Are there new log lines? If not, should there be?
- Are the logs at the right logging level, balancing noise vs signal? Logs can be expensive.
- Are the logs easily traceable to code during an incident?
- Have we included relevant context (e.g., request IDs, user IDs) in log messages?
- Can these logs accidentally log any senstivie information?
- Do we need to add analytics to track user behavior for this change?
- Does the Analytics/Data team need to know about this change?
- How do we know if it's working or not working in production?
- Do we need to update or create any dashboards?
- Do we need to update or create any alerts?
- Will users be able to immediately use this in production?
- Is this behind a feature flag? If not, should it be?
- If it's behind a feature flag, how do we manage the rollout? Who has access? Is there documentation?
- Is this PR safely revertable? If not, why not? How do we mitigate that risk?
- How do we roll it back?
- How quickly can we roll it back?
- Who has access to roll it back?
- Do we have plans to actually follow through with the deprecation?
- How do we identify the users/customers that are using a now-deprecated feature and help them migrate to the new version?
- What timeframe are we putting in place for the deprecation EOL?