@divyaprakash-iitd
Last active August 31, 2025 10:54
GSoC 2025 Final Report

➡️ View/Download the Detailed GSoC Final Report (PDF)

Personal Information

  • Name: Divyaprakash
  • GitHub: divyaprakash-iitd
  • Website: dpcfd.com
  • Project: Using Data-Driven, Physics-Informed ML to Model Fluid Properties in CFD
  • Organization: SU2 Foundation

Overview

SU2 is an open-source suite for computational fluid dynamics (CFD) used in research and design. MLPCpp is a compact C++ library inside SU2 that enables multilayer perceptron (MLP) inference within solver workflows, for tasks such as fluid property prediction.

This project aimed to enhance SU2 and MLPCpp by addressing performance bottlenecks in neural network inference and by improving usability for data-driven workflows. Proposed improvements included hash maps for faster variable lookups, a refactor of network selection and evaluation with SIMD and batched inference, and more robust inverse regression methods informed by the literature. During GSoC I integrated a runtime profiler along with its documentation, implemented variable-mapping improvements that are now under review, and initiated the network-evaluation refactor, laying the groundwork for the remaining optimizations and benchmarks.

Deliverables

  • Profiling tool integration (merged)
  • Documentation (merged; can be improved further)
  • Variable-mapping optimization (completed; PR in review)
  • Network-evaluation optimization (completed; PR in review)

Note: The original proposal included additional deliverables. The above list reflects the key contributions I was able to complete during GSoC 2025.

Work Done

1. Profiling framework evaluation and integration

I explored the existing tooling for performance analysis inside SU2. The built-in profiler options did not provide the detailed call-level visibility and ease of use we needed, and Valgrind gave very fine detail but its overhead made it impractical for regular use. After this evaluation I integrated the Tracy profiler, which offers low-overhead sampling, real-time visualization, and easy instrumentation for C and C++ projects. I created and merged a pull request that allows the develop branch of SU2 to be compiled with Tracy profiling enabled. The integration includes build options and an example of how to instrument code regions for tracing. The website documentation for using Tracy with SU2 was added in a separate, merged documentation pull request. See the Tracy pull request and the documentation pull request in the Links section for details and instructions.

2. Variable mapping optimization in MLPCpp

I worked on the MLPCpp subproject to improve how variables are looked up and mapped at runtime. The previous approach repeated the same lookups inside tight loops, adding overhead. I implemented a hash-map-based mapping and validated it with tests on representative cases. The change eliminates repeated scanning and improves lookup performance for typical MLPCpp use cases. The implementation is available in my forked MLPCpp repository, and the change is currently under review.

3. Neural network evaluation refactor and performance improvements

I refactored the internal storage format for network weights from a nested std::vector implementation to a triple-pointer structure. This change eliminated the dynamic memory allocation overhead of vectors, which was unnecessary since neural network weights are sized only once and do not require runtime resizing. Following the KISS (keep it simple, stupid) principle, the pointer-based approach provides a simpler, more direct solution that improves memory access efficiency and cache locality. Additionally, the neural network evaluation function was optimized by caching neuron counts before entering loops and reducing redundant array accesses.

These changes yielded a 1.33× speedup, corresponding to a 24.6% reduction in execution time, when tested on the MLPCpp repository's included test case. However, when integrated into full SU2 simulations, these gains were not replicated, highlighting the importance of developing test cases that accurately reflect real-world deployment conditions. The improvements have been submitted via a pull request (see the Links section), and future work will focus on creating more representative benchmarks to better validate optimization efforts.

4. Documentation and outreach

I added usage instructions and examples so new contributors can instrument SU2 with Tracy and reproduce profiling results. I also maintained a blog series on my personal website documenting design decisions, profiling results, and lessons learned during the project; it includes step-by-step notes that will help new contributors adopt the same workflow.

Future Work

  1. Representative test case development: Develop test cases that more accurately reflect the computational environment and workload characteristics of full SU2 simulations, enabling better validation of performance optimizations in realistic deployment scenarios.
  2. Flattened weight storage: Complete the debugging of the flattened vector implementation and compare its performance against the nested vector version to evaluate the actual performance gains in both standalone and integrated environments.
  3. Refactor variable mapping: Finalize replacement of the FindVariableIndices function with a direct, maintainable mapping for faster lookups.
  4. Tracing support: Develop a Tracy tutorial demonstrating how to compile SU2 with tracing, instrument code sections, and interpret performance traces.
  5. Documentation: Expand MLPCpp documentation and onboarding notes, including example cases for easier adoption by new users.

Acknowledgement

I would like to sincerely thank Google and the SU2 community for providing me with the opportunity to contribute to this project through GSoC 2025. I am especially grateful to my mentors, Evert Bunschoten and Joshua A. Kelly, for their constant guidance, encouragement, and support throughout the project. Their approachable and friendly mentorship created an excellent learning environment, and their insights were invaluable in shaping the progress and outcomes of this work. I am also thankful to the wider SU2 developer community for their feedback, code reviews, and discussions, which greatly enriched the overall experience.

Links

  1. Tracy pull request for SU2 integration
  2. Tracy documentation pull request on the SU2 website repository
  3. Weights matrix refactoring and compute functions optimization pull request
  4. My forked MLPCpp repository
  5. My blog series documenting the project
  6. Tracy documentation pdf
  7. Detailed final report