➡️ View/Download the Detailed GSoC Final Report (PDF)
- Name: Divyaprakash
- GitHub: divyaprakash-iitd
- Website: dpcfd.com
- Project: Using Data-Driven, Physics-Informed ML to Model Fluid Properties in CFD
- Organization: SU2 Foundation
SU2 is an open-source suite for computational fluid dynamics used in research and design. MLPCpp is a compact C++ library inside SU2 that enables multilayer perceptron inference within solver workflows for tasks such as fluid property prediction.
This project aimed to enhance SU2 and MLPCpp by addressing performance bottlenecks in neural network inference and by improving usability for data-driven workflows. Proposed improvements included hash maps for faster variable lookups, a refactor of network selection and evaluation with SIMD and batched inference, and more robust inverse regression methods informed by the literature. During GSoC I integrated a runtime profiler and its documentation, implemented variable mapping improvements that are now under review, and initiated the network evaluation refactor, laying the groundwork for the remaining optimizations and benchmarks.
- Implementing profiling tool (merged)
- Documentation (merged, can be improved further)
- Optimizing variable mapping (completed, PR not merged yet / in review)
- Optimizing network evaluation (completed, PR not merged yet / in review)
Note: The original proposal included additional deliverables. The above list reflects the key contributions I was able to complete during GSoC 2025.
I explored existing tooling for performance analysis inside SU2. The built-in profiler options did not provide the detailed call-level visibility and ease of use we needed. Valgrind gave very fine detail, but its overhead made it impractical for regular use. After evaluation I integrated the Tracy profiler because it offers low-overhead sampling, real-time visualization, and easy instrumentation for C and C++ projects. I created and merged a pull request that allows the develop branch of SU2 to be compiled with Tracy profiling enabled. The integration includes build options and an example of how to instrument code regions for tracing. The website documentation for using Tracy with SU2 was added in a separate, merged documentation pull request. See the Tracy pull request and the documentation pull request in the Links section for details and instructions.
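As a rough illustration of what instrumenting a code region looks like, the sketch below uses Tracy's own `ZoneScopedN` and `FrameMark` macros. The include path `tracy/Tracy.hpp` is an assumption that matches recent Tracy releases, and the functions `SumOfSquares` and `RunIterations` are hypothetical stand-ins for solver code. The SU2 integration has its own build option and may wrap these macros differently, so the merged pull request and documentation remain the reference for the actual usage.

```cpp
#include <cstddef>
#include <vector>

// Use Tracy when the build enables it and the header can be found; otherwise
// fall back to empty macros so this sketch still compiles on its own.
#if defined(TRACY_ENABLE) && defined(__has_include)
#  if __has_include(<tracy/Tracy.hpp>)
#    include <tracy/Tracy.hpp>
#    define SKETCH_HAS_TRACY 1
#  endif
#endif
#ifndef SKETCH_HAS_TRACY
#  define ZoneScopedN(name)
#  define FrameMark
#endif

// Hypothetical hot routine standing in for a solver kernel.
double SumOfSquares(const std::vector<double>& values) {
  ZoneScopedN("SumOfSquares");  // the zone lasts until this scope exits
  double sum = 0.0;
  for (double v : values) sum += v * v;
  return sum;
}

// Hypothetical outer loop: each iteration is reported as one frame.
double RunIterations(std::size_t n_iter, const std::vector<double>& values) {
  double total = 0.0;
  for (std::size_t i = 0; i < n_iter; ++i) {
    total += SumOfSquares(values);
    FrameMark;  // marks the end of one iteration on the Tracy timeline
  }
  return total;
}
```

When Tracy is not enabled, the macros expand to nothing, so the instrumentation can stay in the code without affecting regular builds.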
I worked on the MLPCpp subproject to improve how variables are looked up and mapped at runtime. The previous approach repeated the same lookups, which added overhead in tight loops. I implemented a hash-map-based mapping and validated it with tests on representative cases. The change avoids the repeated scanning and improves lookup performance for typical cases in MLPCpp. The implementation is available in my forked MLPCpp repository, and the related change is currently under review.
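The general idea can be sketched as follows. This is not the MLPCpp code: the class `VariableIndexMap`, the function `FindIndexLinear`, and the variable names are hypothetical, and the sketch only contrasts a per-call linear scan with a name-to-index hash map that is built once.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Linear scan: up to n string comparisons on every call.
std::size_t FindIndexLinear(const std::vector<std::string>& names,
                            const std::string& query) {
  for (std::size_t i = 0; i < names.size(); ++i)
    if (names[i] == query) return i;
  throw std::out_of_range("unknown variable: " + query);
}

// Hash-map lookup: the map is built once, then each query costs O(1) on average.
class VariableIndexMap {
 public:
  explicit VariableIndexMap(const std::vector<std::string>& names) {
    index_.reserve(names.size());
    // emplace keeps the first occurrence of a duplicate name,
    // which matches what the linear scan returns.
    for (std::size_t i = 0; i < names.size(); ++i) index_.emplace(names[i], i);
  }

  std::size_t Find(const std::string& query) const {
    const auto it = index_.find(query);
    if (it == index_.end()) throw std::out_of_range("unknown variable: " + query);
    return it->second;
  }

 private:
  std::unordered_map<std::string, std::size_t> index_;
};
```

In a tight loop it usually pays to go one step further: resolve the indices once before the loop and reuse the stored integers, so that no string hashing happens per evaluation.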
I refactored the internal storage format for network weights from a nested std::vector implementation to a triple-pointer structure. This change eliminated the dynamic memory allocation overhead of vectors, which was unnecessary since neural network weights are sized only once and do not require runtime resizing. Following the KISS (keep it simple, stupid) principle, the pointer-based approach provides a simpler, more direct solution that improves memory access efficiency and cache locality. Additionally, the neural network evaluation function was optimized by caching neuron counts before entering loops and reducing redundant array accesses.
These changes yielded a 1.33× speedup, corresponding to a 24.6% reduction in execution time, when tested on the test case included in the MLPCpp repository. However, when integrated into full SU2 simulations, these performance gains were not replicated, which highlights the importance of developing test cases that accurately reflect real-world deployment conditions. The improvements have been submitted via a pull request (see the Links section), and future work will focus on creating more representative benchmarks to better validate optimization efforts.
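To make the storage change concrete, here is a minimal sketch of a pointer-based weight container and an evaluation loop that caches neuron counts. It is an illustration under stated assumptions, not the code in the pull request: the class `DenseWeights`, the function `Evaluate`, the omission of biases, and the ReLU activation are all hypothetical choices made to keep the example short.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical container: w_[l][j][i] connects neuron i of layer l to
// neuron j of layer l + 1. Allocated once in the constructor, never resized.
class DenseWeights {
 public:
  explicit DenseWeights(const std::vector<std::size_t>& layer_sizes)
      : sizes_(layer_sizes) {
    if (sizes_.size() < 2) throw std::invalid_argument("need at least 2 layers");
    const std::size_t n_gaps = sizes_.size() - 1;
    w_ = new double**[n_gaps];
    for (std::size_t l = 0; l < n_gaps; ++l) {
      w_[l] = new double*[sizes_[l + 1]];
      for (std::size_t j = 0; j < sizes_[l + 1]; ++j)
        w_[l][j] = new double[sizes_[l]]();  // zero-initialised row
    }
  }

  ~DenseWeights() {
    for (std::size_t l = 0; l + 1 < sizes_.size(); ++l) {
      for (std::size_t j = 0; j < sizes_[l + 1]; ++j) delete[] w_[l][j];
      delete[] w_[l];
    }
    delete[] w_;
  }

  // Raw owning pointers: forbid copies to avoid double deletion.
  DenseWeights(const DenseWeights&) = delete;
  DenseWeights& operator=(const DenseWeights&) = delete;

  std::size_t NumLayers() const { return sizes_.size(); }
  std::size_t Size(std::size_t l) const { return sizes_[l]; }
  double& At(std::size_t l, std::size_t j, std::size_t i) { return w_[l][j][i]; }
  const double* Row(std::size_t l, std::size_t j) const { return w_[l][j]; }

 private:
  std::vector<std::size_t> sizes_;
  double*** w_ = nullptr;
};

// Forward pass without biases, using ReLU on every layer (illustrative only).
std::vector<double> Evaluate(const DenseWeights& net, std::vector<double> act) {
  if (act.size() != net.Size(0)) throw std::invalid_argument("bad input size");
  const std::size_t n_layers = net.NumLayers();
  for (std::size_t l = 0; l + 1 < n_layers; ++l) {
    const std::size_t n_in = net.Size(l);       // neuron counts read once
    const std::size_t n_out = net.Size(l + 1);  // per layer, not per element
    std::vector<double> next(n_out);
    for (std::size_t j = 0; j < n_out; ++j) {
      const double* row = net.Row(l, j);  // one pointer fetch per output neuron
      double sum = 0.0;
      for (std::size_t i = 0; i < n_in; ++i) sum += row[i] * act[i];
      next[j] = sum > 0.0 ? sum : 0.0;
    }
    act = std::move(next);
  }
  return act;
}
```

The sketch still allocates a temporary vector per layer for the activations. A production version would more likely reuse preallocated buffers, which is one reason a standalone micro-benchmark and a full simulation can behave differently.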
I added usage instructions and examples so new contributors can instrument SU2 with Tracy and reproduce profiling results. I also maintained a blog series on my personal website that documents design decisions, profiling results, and lessons learned during the project. The series includes step-by-step notes that will help new contributors adopt the same workflow.
- Representative test case development: Develop test cases that more accurately reflect the computational environment and workload characteristics of full SU2 simulations, enabling better validation of performance optimizations in realistic deployment scenarios.
- Flattened weight storage: Complete the debugging of the flattened vector implementation and compare its performance against the nested vector version to evaluate the actual performance gains in both standalone and integrated environments.
- Refactor variable mapping: Finalize replacement of the `FindVariableIndices` function with a direct, maintainable mapping for faster lookups.
- Tracing support: Develop a Tracy tutorial demonstrating how to compile SU2 with tracing, instrument code sections, and interpret performance traces.
- Documentation: Expand `MLPCpp` documentation and onboarding notes, including example cases for easier adoption by new users.
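For readers unfamiliar with the flattened weight storage mentioned in the list above, the sketch below shows one common way to lay out all weights in a single contiguous buffer. It is only an assumption about what such a layout could look like, not the implementation that is still being debugged: the struct `FlatWeights`, its members, and the row-major ordering are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flat layout: the weight blocks between consecutive layers are
// stored back to back in one buffer, each block in row-major order.
struct FlatWeights {
  std::vector<std::size_t> sizes;    // neurons per layer
  std::vector<std::size_t> offsets;  // start of each layer-to-layer block
  std::vector<double> data;          // all weights, contiguous

  explicit FlatWeights(const std::vector<std::size_t>& layer_sizes)
      : sizes(layer_sizes) {
    std::size_t total = 0;
    for (std::size_t l = 0; l + 1 < sizes.size(); ++l) {
      offsets.push_back(total);
      total += sizes[l + 1] * sizes[l];  // n_out rows of n_in weights each
    }
    data.assign(total, 0.0);
  }

  // Position of the weight from neuron i (layer l) to neuron j (layer l + 1).
  std::size_t Index(std::size_t l, std::size_t j, std::size_t i) const {
    return offsets[l] + j * sizes[l] + i;
  }

  double& At(std::size_t l, std::size_t j, std::size_t i) {
    return data[Index(l, j, i)];
  }
};
```

With this layout the inner loop over `i` walks adjacent memory, and the whole network lives in one allocation. Whether that translates into a measurable gain inside SU2 is exactly what the planned comparison would need to establish.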
I would like to sincerely thank Google and the SU2 community for providing me with the opportunity to contribute to this project through GSoC 2025. I am especially grateful to my mentors, Evert Bunschoten and Joshua A. Kelly, for their constant guidance, encouragement, and support throughout the project. Their approachable and friendly mentorship created an excellent learning environment, and their insights were invaluable in shaping the progress and outcomes of this work. I am also thankful to the wider SU2 developer community for their feedback, code reviews, and discussions, which greatly enriched the overall experience.