@DavidKorczynski
Created August 18, 2024 17:06
Java LLM-based fuzz harness generation for DiffLib
#### Prompt
You are a security testing engineer who wants to write a Java program to execute all lines in a given method by defining and initialising its parameters and necessary objects in a suitable way before fuzzing the method.
Carefully study the method signature and its parameters, then follow the example problems and solutions to answer the final problem. YOU MUST call the target method to fuzz in the solution.
The <target> tag contains information of the target method to invoke.
The <arguments> tag contains information of each of the target method arguments.
The <exceptions> tag contains a list of exceptions thrown by the target method that you MUST catch.
The <constructor> tag contains constructor or method call details you MUST use to create the needed object before calling the target method.
The <requirements> tag contains additional requirements that you MUST follow for this code generation.
<target>
Your goal is to write a fuzzing harness for the provided method signature to fuzz the method with random data. It is important that the provided solution compiles and actually calls the function specified by the method signature:
<method_signature>
[com.github.difflib.text.DiffRowGenerator].generateDiffRows(java.util.List<String>,com.github.difflib.patch.Patch<String>)
</method_signature>
The method signature follows the format of <code>[Fully qualified name of the class].method_name(method_arguments)</code>.
For example, a method <code>test</code> in class <code>Test</code> of package <code>org.test</code> that takes a single integer has the following method signature:
<code>[org.test.Test].test(int)</code>
The target method belongs to the Java project java-diff-utils (https://github.com/java-diff-utils/java-diff-utils).
</target>
<arguments>
1. Argument #0 requires a java.util.List instance with a generic type of String. You MUST create an empty java.util.List<String> instance, then fill the list with multiple DIFFERENT String objects generated by FuzzedDataProvider::consumeString(int) or FuzzedDataProvider::consumeAsciiString(int) or FuzzedDataProvider::consumeRemainingAsString() or FuzzedDataProvider::consumeRemainingAsAsciiString() or FuzzedDataProvider::pickValue(String[]) methods.
2. Argument #1 requires a com.github.difflib.patch.Patch instance with a generic type of String. You MUST create two empty java.util.List<String> instances, then fill the two lists with multiple DIFFERENT String objects generated by FuzzedDataProvider::consumeString(int) or FuzzedDataProvider::consumeAsciiString(int) or FuzzedDataProvider::consumeRemainingAsString() or FuzzedDataProvider::consumeRemainingAsAsciiString() or FuzzedDataProvider::pickValue(String[]) methods. After creating the two lists, use them to invoke the STATIC method com.github.difflib.DiffUtils.diff(java.util.List<String>,java.util.List<String>) to generate a com.github.difflib.patch.Patch instance with a generic type of String.
</arguments>
<constructor>
<signature>DiffRowGenerator.Builder.build()</signature>
<prerequisite>
You MUST call the STATIC method DiffRowGenerator.create() to retrieve an instance of DiffRowGenerator.Builder before invoking DiffRowGenerator.Builder.build() to generate a com.github.difflib.text.DiffRowGenerator instance.
</prerequisite>
</constructor>
<requirements>
Please fulfil all the requirements in the following list.
1. Try as many variations of these inputs as possible.
2. Try to make the harness as complex as possible.
3. Try adding nested loops that invoke the target method multiple times with different random data.
4. The generated fuzzing harness should be wrapped with the <java_code> tag.
5. NEVER use any methods from the <code>java.lang.Random</code> class in the generated code.
6. NEVER use any classes or methods in the <code>java.lang.reflect</code> package in the generated code.
7. NEVER use the @FuzzTest annotation for specifying the fuzzing method.
8. Please avoid using any multithreading or multi-processing approach.
9. Please add import statements for necessary classes, except for classes in the java.lang package.
10. You must create the com.github.difflib.text.DiffRowGenerator object before calling the target method.
11. Do not create new variables with the same names as existing variables.
WRONG_CODE:
<code>
public static void testing(int test) {
String test = "Testing";
}
</code>
12. Always create the fuzzing harness from the following templates.
<code>
import com.code_intelligence.jazzer.api.FuzzedDataProvider;
// Other imports
public class DiffUtilsFuzzer {
public static void fuzzerInitialize() {
// Initializing objects for fuzzing
}
public static void fuzzerTearDown() {
// Tear down objects after fuzzing
}
public static void fuzzerTestOneInput(FuzzedDataProvider data) {
// Use the FuzzedDataProvider object to generate random data for fuzzing
// Fuzz by invoking the target method with random parameters / objects generated above.
}
}
</code>
</requirements>
<data_mapping>
Here is a markdown table showing methods that should be used to generate random data of some argument types:
| Argument types | Methods for generating random data |
| --- | --- |
| int or java.lang.Integer | FuzzedDataProvider::consumeInt() or FuzzedDataProvider::consumeInt(int, int) or FuzzedDataProvider::pickValue(int[]) |
| int[] | FuzzedDataProvider::consumeInts(int) or FuzzedDataProvider::pickValues(T[], int) or FuzzedDataProvider::pickValues(Collection<T>, int) |
| java.lang.Integer[] | new Integer[]{int...} |
| boolean or java.lang.Boolean | FuzzedDataProvider::consumeBoolean() or FuzzedDataProvider::pickValue(boolean[]) |
| boolean[] | FuzzedDataProvider::consumeBooleans(int) or FuzzedDataProvider::pickValues(T[], int) or FuzzedDataProvider::pickValues(Collection<T>, int) |
| java.lang.Boolean[] | new Boolean[]{boolean...} |
| byte or java.lang.Byte | FuzzedDataProvider::consumeByte() or FuzzedDataProvider::consumeByte(byte,byte) or FuzzedDataProvider::pickValue(byte[]) |
| byte[] | FuzzedDataProvider::consumeBytes(int) or FuzzedDataProvider::consumeRemainingAsBytes() or FuzzedDataProvider::pickValues(T[], int) or FuzzedDataProvider::pickValues(Collection<T>, int) |
| java.lang.Byte[] | new Byte[] {byte...} |
| short or java.lang.Short | FuzzedDataProvider::consumeShort() or FuzzedDataProvider::consumeShort(short, short) or FuzzedDataProvider::pickValue(short[]) |
| short[] | FuzzedDataProvider::consumeShorts(int) or FuzzedDataProvider::pickValues(T[], int) or FuzzedDataProvider::pickValues(Collection<T>, int) |
| java.lang.Short[] | new Short[] {short...} |
| long or java.lang.Long | FuzzedDataProvider::consumeLong() or FuzzedDataProvider::consumeLong(long, long) or FuzzedDataProvider::pickValue(long[]) |
| long[] | FuzzedDataProvider::consumeLongs(int) or FuzzedDataProvider::pickValues(T[], int) or FuzzedDataProvider::pickValues(Collection<T>, int) |
| java.lang.Long[] | new Long[] {long...} |
| float or java.lang.Float | FuzzedDataProvider::consumeFloat() or FuzzedDataProvider::consumeRegularFloat() or FuzzedDataProvider::consumeRegularFloat(float, float) or FuzzedDataProvider::consumeProbabilityFloat() or FuzzedDataProvider::pickValue(float[]) |
| double or java.lang.Double | FuzzedDataProvider::consumeDouble() or FuzzedDataProvider::consumeRegularDouble() or FuzzedDataProvider::consumeRegularDouble(double, double) or FuzzedDataProvider::consumeProbabilityDouble() or FuzzedDataProvider::pickValue(double[]) |
| char or java.lang.Character | FuzzedDataProvider::consumeChar() or FuzzedDataProvider::consumeCharNoSurrogates() or FuzzedDataProvider::consumeChar(char, char) or FuzzedDataProvider::pickValue(char[]) |
| java.lang.String or java.lang.CharSequence | FuzzedDataProvider::consumeString(int) or FuzzedDataProvider::consumeAsciiString(int) or FuzzedDataProvider::consumeRemainingAsString() or FuzzedDataProvider::consumeRemainingAsAsciiString() or FuzzedDataProvider::pickValue(String[]) |
</data_mapping>
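
As an aside on the `[Fully qualified name of the class].method_name(method_arguments)` format that the prompt describes, the following is a minimal sketch of how such a signature string could be split into its parts. The `MethodSignature` class and its `parse` method are hypothetical names introduced here for illustration; they are not part of the gist, Jazzer, or java-diff-utils. The sketch assumes the class name is wrapped in square brackets and that commas inside generic angle brackets do not separate arguments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (an assumption for illustration, not an existing API).
final class MethodSignature {
    final String className;
    final String methodName;
    final List<String> argumentTypes;

    private MethodSignature(String className, String methodName, List<String> argumentTypes) {
        this.className = className;
        this.methodName = methodName;
        this.argumentTypes = argumentTypes;
    }

    // Parses "[pkg.Class].method(argType1,argType2)" into its three parts.
    static MethodSignature parse(String signature) {
        String s = signature.trim();
        int close = s.indexOf("].");
        if (!s.startsWith("[") || close < 0) {
            throw new IllegalArgumentException("Expected [class].method(args): " + signature);
        }
        int open = s.indexOf('(', close);
        if (open < 0 || !s.endsWith(")")) {
            throw new IllegalArgumentException("Missing argument list: " + signature);
        }
        String className = s.substring(1, close);
        String methodName = s.substring(close + 2, open);
        String args = s.substring(open + 1, s.length() - 1);
        return new MethodSignature(className, methodName, splitTopLevel(args));
    }

    // Splits on commas that are not nested inside generic angle brackets.
    private static List<String> splitTopLevel(String args) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char c : args.toCharArray()) {
            if (c == '<') {
                depth++;
            } else if (c == '>') {
                depth--;
            }
            if (c == ',' && depth == 0) {
                out.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        String last = current.toString().trim();
        if (!last.isEmpty()) {
            out.add(last);
        }
        return out;
    }
}
```

For the target in this gist, `MethodSignature.parse` would yield the class `com.github.difflib.text.DiffRowGenerator`, the method `generateDiffRows`, and the two argument types `java.util.List<String>` and `com.github.difflib.patch.Patch<String>`.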
#### Generated harness
```java
import com.code_intelligence.jazzer.api.FuzzedDataProvider;
import com.github.difflib.DiffUtils;
import com.github.difflib.patch.Patch;
import com.github.difflib.text.DiffRow;
import com.github.difflib.text.DiffRowGenerator;
import java.util.ArrayList;
import java.util.List;

public class DiffUtilsFuzzer {
    private static DiffRowGenerator.Builder builder;

    public static void fuzzerInitialize() {
        builder = DiffRowGenerator.create();
    }

    public static void fuzzerTearDown() {
        // Clean up resources after fuzzing
    }

    public static void fuzzerTestOneInput(FuzzedDataProvider data) {
        List<String> list1 = new ArrayList<>();
        list1.add(data.consumeString(10));
        list1.add(data.consumeAsciiString(5));
        // Note: consumeRemainingAsString() drains the fuzzer input, so every
        // later consume* call in this method returns an empty string.
        list1.add(data.consumeRemainingAsString());

        List<String> list2 = new ArrayList<>();
        list2.add(data.consumeString(8));
        list2.add(data.consumeAsciiString(6));
        list2.add(data.consumeRemainingAsAsciiString());

        Patch<String> patch = DiffUtils.diff(list1, list2);
        DiffRowGenerator generator = builder.build();
        // generateDiffRows returns DiffRow objects, not plain strings
        List<DiffRow> diffRows = generator.generateDiffRows(list1, patch);

        // Nested loop to invoke the target method multiple times with different random data
        for (int i = 0; i < 3; i++) {
            List<String> nestedList1 = new ArrayList<>();
            nestedList1.add(data.consumeString(7));
            nestedList1.add(data.consumeAsciiString(4));
            nestedList1.add(data.consumeRemainingAsString());

            List<String> nestedList2 = new ArrayList<>();
            nestedList2.add(data.consumeString(9));
            nestedList2.add(data.consumeAsciiString(3));
            nestedList2.add(data.consumeRemainingAsAsciiString());

            Patch<String> nestedPatch = DiffUtils.diff(nestedList1, nestedList2);
            List<DiffRow> nestedDiffRows = generator.generateDiffRows(nestedList1, nestedPatch);
        }
    }
}
```
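
One property of the generated harness worth illustrating is the order in which it consumes input: a "consume remaining" call takes everything that is left, so bounded calls made afterwards have nothing to read. The sketch below models this with `ToyDataCursor`, a hypothetical and heavily simplified stand-in written for this note. It is not Jazzer's `FuzzedDataProvider` (which works on raw bytes and has its own decoding rules); it only assumes the same "return an empty string when the input is exhausted" behaviour.

```java
// Hypothetical, simplified stand-in for a fuzz data provider (an assumption
// for illustration only; the real FuzzedDataProvider operates on bytes).
final class ToyDataCursor {
    private final String input;
    private int pos = 0;

    ToyDataCursor(String input) {
        this.input = input;
    }

    // Returns up to maxLength characters, or "" once the input is exhausted.
    String consumeString(int maxLength) {
        int end = Math.min(input.length(), pos + Math.max(0, maxLength));
        String out = input.substring(pos, end);
        pos = end;
        return out;
    }

    // Returns everything that is left and leaves the cursor empty.
    String consumeRemainingAsString() {
        String out = input.substring(pos);
        pos = input.length();
        return out;
    }

    int remaining() {
        return input.length() - pos;
    }
}

final class ExhaustionDemo {
    public static void main(String[] args) {
        ToyDataCursor data = new ToyDataCursor("abcdefghijklmnopqrstuvwxyz");
        String first = data.consumeString(10);          // "abcdefghij"
        String rest = data.consumeRemainingAsString();  // "klmnopqrstuvwxyz"
        String later = data.consumeString(8);           // "" because nothing is left
        System.out.println(first + " | " + rest + " | [" + later + "]");
    }
}
```

Under this model, a harness that wants non-empty data in a later loop would use only bounded calls such as `consumeString(int)` until its final read, and reserve a "consume remaining" call for the very last value it needs.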