@kgilpin
kgilpin / lint-errors.xml
Created June 25, 2024 15:02
Navie lint repair
<lint-errors>
sklearn/utils/validation.py:485:26: F821 undefined name 'pd'
</lint-errors>
<diff>--- /home/runner/work/SWE-bench/SWE-bench/logs/solve/scikit-learn__scikit-learn-14878/2/diff/sklearn_utils_validation.py/pre/base 2024-06-15 20:45:28.079868186 +0000
+++ /home/runner/work/SWE-bench/SWE-bench/logs/solve/scikit-learn__scikit-learn-14878/2/diff/sklearn_utils_validation.py/pre/updated 2024-06-15 20:45:28.079868186 +0000
@@ -338,8 +338,7 @@
dtype="numeric", order=None, copy=False, force_all_finite=True,
ensure_2d=True, allow_nd=False, ensure_min_samples=1,
ensure_min_features=1, warn_on_dtype=None, estimator=None):
-
kgilpin / solution.xml
Created June 25, 2024 15:00
Navie-generated code change
<changeset>
<change>
<file change-number-for-this-file="1">sklearn/preprocessing/_encoders.py</file>
<original line-count="14" no-ellipsis="true"><![CDATA[
for i in range(n_features):
Xi = X[:, i]
diff, valid_mask = _encode_check_unknown(Xi, self.categories_[i],
return_mask=True)
if not np.all(valid_mask):
kgilpin / plan.md
Last active June 25, 2024 14:57
scikit-learn-12471

Fixes: scikit-learn/scikit-learn#12470

Title: Fix OneHotEncoder to Safely Handle String Categories for ignore Unknown Strategy

Problem: scikit-learn's OneHotEncoder raises a ValueError in transform when handle_unknown='ignore' is set and the categories are strings. This occurs when the first fitted category, OneHotEncoder.categories_[i][0], is longer than the strings in the array being transformed. That first category is used to replace unknown entries, and if it is longer than the target array's dtype allows, it is truncated, causing subsequent array operations to fail.
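
The truncation described above can be reproduced with plain NumPy, outside of scikit-learn. The following is a minimal sketch, not the encoder's actual code: the variable names and sample strings are made up for illustration, and casting to object dtype is only one possible way to avoid the truncation.

```python
import numpy as np

# Hypothetical categories learned during fit; the first one is long.
categories = np.array(["a_much_longer_category", "b"])

# Data to transform: short strings, so NumPy infers a narrow dtype (<U3).
X = np.array(["b", "zzz"])  # "zzz" is an unknown category

# True where the entry is a known category.
valid_mask = np.isin(X, categories)

# Writing the long category into the narrow array silently truncates it,
# leaving a value that is no longer one of the known categories.
X_truncated = X.copy()
X_truncated[~valid_mask] = categories[0]

# Casting to object dtype first keeps the full string intact.
X_safe = X.astype(object)
X_safe[~valid_mask] = categories[0]

print(repr(X_truncated[1]), repr(X_safe[1]))
```

The truncated value no longer matches any fitted category, which is why a later lookup against the categories can fail.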

Analysis: The root cause is that NumPy string arrays use a fixed-width dtype, so an array sized for short strings cannot hold a longer one without truncating it. Specifically, when the handle_unknown='ignore' option is used, unknown categories are replaced by a known category from the `categories_

kgilpin / sequenceDiagram.js
Created March 3, 2023 14:08
Read diagram data from a ZIP file
const params = new URL(document.location).searchParams;
const diagramUrl = params.get('diagram');
let diagramData;
if (diagramUrl.endsWith('.zip')) {
const resourceId = params.get('resourceId');
if (!resourceId) throw new Error(`resourceId is required with diagram resource ZIP file`);
const localStorageKey = ['appmap', 'diagram', diagramUrl].join('-');
let diagramDataEncoded;
if (!['false', 'no'].includes(params.get('cache')))
kgilpin / Following_followers_page.appmap.json
Created November 29, 2022 18:55
Example files from blog post "Generate sequence diagrams from runtime code analysis"
{
"events": [
{
"id": 1,
"event": "call",
"thread_id": 4280,
"defined_class": "ActionDispatch::Integration::Runner",
"method_id": "before_setup",
"path": "/Users/kgilpin/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/gems/actionpack-6.0.4.1/lib/action_dispatch/testing/integration.rb",
"lineno": 320,
kgilpin / appmap.rake
Created November 29, 2022 16:19
Rake task to detect untested routes
# Sort by path, then by method
compare_routes = lambda do |a,b|
compare = a[1] <=> b[1]
compare = a[0] <=> b[0] if compare == 0
compare
end
normalize_method = lambda do |method|
method = method.upcase
# For Rails, HEAD request is served by 'show' (GET)
kgilpin / pre_request.js
Last active July 14, 2022 16:59
Rails Sample App 6th Ed Postman Pre-request script
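pm.sendRequest(pm.variables.get('baseUrl'), function (err, response) {
  // Stop early if the request itself failed
  if (err) {
    console.error(err);
    return;
  }
  // Extract the CSRF token from the page's csrf-token meta tag
  const token = /meta\s+name="csrf-token"\s+content="([^"]*)"/.exec(response.text());
  console.log(`CSRF protection: ${token ? token[1] : 'token not found'}`);
  if (token) {
    pm.globals.set('authenticity_token', token[1]);
  }
});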
kgilpin / head_sql.yml
Last active July 13, 2021 15:40
Example files - Preventing data leaks with runtime code analysis
- DELETE FROM "ahoy_messages" WHERE "ahoy_messages"."user_id" = $?
- DELETE FROM "api_secrets" WHERE "api_secrets"."id" = $?
- DELETE FROM "api_secrets" WHERE "api_secrets"."user_id" = $?
- DELETE FROM "articles" WHERE "articles"."id" = $?
- DELETE FROM "badge_achievements" WHERE "badge_achievements"."id" = $?
- DELETE FROM "badge_achievements" WHERE "badge_achievements"."user_id" = $?
- DELETE FROM "broadcasts" WHERE "broadcasts"."id" = $?
- >-
DELETE FROM "chat_channel_memberships" WHERE "chat_channel_memberships"."id"
= $?
{
"$schema": "https://aka.ms/codetour-schema",
"title": "Install AppMap for RSpec",
"steps": [
{
"file": "spec/spec_helper.rb",
"description": "You'll now install the AppMap RSpec helper.\n\n```ruby\nrequire 'appmap/rspec'\n```\n\nThis line should be placed **before** any other `require` statements.\n",
"line": 2,
"contents": "require 'appmap/rspec'"
},