[0,29] . D============eeeER . . . . . . . . . . . . vaddps ymm0, ymm0, ymm1
[0,31] . D==============eeeER. . . . . . . . . . . . vaddps ymm0, ymm0, ymm1
[0,33] . D=================eeeER . . . . . . . . . . . vaddps ymm0, ymm0, ymm1
[0,35] . D====================eeeER . . . . . . . . . . vaddps ymm0, ymm0, ymm1
[0,38] . .D======================eeeER . . . . . . . . . . vaddps ymm0, ymm0, ymm1
[0,41] . .D=========================eeeER . . . . . . . . . vaddps ymm0, ymm0, ymm1
[0,44] . . D===========================eeeER. . . . . . . . . vaddps ymm0, ymm0, ymm1
Iterations:        100
Instructions:      12600
Total Cycles:      10225
Total uOps:        22600
Dispatch Width:    6
uOps Per Cycle:    2.21
IPC:               1.23
Block RThroughput: 37.7
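The derived figures in this summary follow directly from the raw counters, and recomputing them is a quick sanity check when comparing runs. The sketch below is only an illustration: `McaSummary`, `SUMMARY`, and the method names are hypothetical and not part of any tool.

```rust
/// Raw counters copied from the summary above.
/// The type and field names are hypothetical, chosen for this sketch.
struct McaSummary {
    iterations: u32,
    instructions: u32,
    total_cycles: u32,
    total_uops: u32,
}

impl McaSummary {
    /// Instructions per cycle: Instructions / Total Cycles.
    fn ipc(&self) -> f64 {
        f64::from(self.instructions) / f64::from(self.total_cycles)
    }

    /// Micro-ops per cycle: Total uOps / Total Cycles.
    fn uops_per_cycle(&self) -> f64 {
        f64::from(self.total_uops) / f64::from(self.total_cycles)
    }

    /// Average simulated cycles spent on one iteration of the block.
    fn cycles_per_iteration(&self) -> f64 {
        f64::from(self.total_cycles) / f64::from(self.iterations)
    }
}

const SUMMARY: McaSummary = McaSummary {
    iterations: 100,
    instructions: 12_600,
    total_cycles: 10_225,
    total_uops: 22_600,
};
```

Rounded to two decimals, `SUMMARY.ipc()` and `SUMMARY.uops_per_cycle()` reproduce the reported 1.23 and 2.21. `SUMMARY.cycles_per_iteration()` comes out at roughly 102 cycles, well above the reported Block RThroughput of 37.7, which suggests the block is limited by something other than raw throughput, possibly the chain of `vaddps` instructions that all read and write `ymm0`.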
        .section .text,"xr",one_only,blogpost::avx2_dot::avx2_dot_product
        .globl blogpost::avx2_dot::avx2_dot_product
        .p2align 4, 0x90
blogpost::avx2_dot::avx2_dot_product:
        .cv_func_id 6
        .seh_proc _ZN8blogpost8avx2_dot16avx2_dot_product17hffb9005f074b96fbE
        sub rsp, 104
        .seh_stackalloc 104
        .seh_endprologue
        mov qword ptr [rsp + 40], rdx
use core::arch::x86_64::*;
#[target_feature(enable = "avx2")]
unsafe fn avx2_dot_product(vector_a: &[f32], vector_b: &[f32]) -> f32 {
    assert_eq!(vector_a.len(), vector_b.len(), "Vectors must be equal in length");
    let dims = vector_a.len();
    // We want to ensure we're going in steps of 8 elements
    // and then handle the remainder separately.
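The preview above stops before the loop itself. As a rough illustration of the idea in its last comment (process 8 elements per step, then handle the remainder separately), here is a minimal sketch. It is not the author's implementation: the names `avx2_dot_sketch`, `scalar_dot_sketch`, and `dot_sketch` are hypothetical, and the single accumulator, unaligned loads, and runtime feature check are assumptions made for this example.

```rust
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;

/// Plain scalar fallback for CPUs without AVX2 (hypothetical helper).
fn scalar_dot_sketch(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn avx2_dot_sketch(vector_a: &[f32], vector_b: &[f32]) -> f32 {
    assert_eq!(vector_a.len(), vector_b.len(), "Vectors must be equal in length");
    let dims = vector_a.len();
    // Largest multiple of 8 that fits; everything past it is the remainder.
    let full = dims - (dims % 8);

    // One 256-bit accumulator holding 8 partial sums.
    let mut acc = _mm256_setzero_ps();
    let mut i = 0;
    while i < full {
        // Unaligned loads: slices carry no 32-byte alignment guarantee.
        let a = _mm256_loadu_ps(vector_a.as_ptr().add(i));
        let b = _mm256_loadu_ps(vector_b.as_ptr().add(i));
        acc = _mm256_add_ps(acc, _mm256_mul_ps(a, b));
        i += 8;
    }

    // Horizontal sum: spill the 8 lanes to the stack and add them up.
    let mut lanes = [0.0f32; 8];
    _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    let mut total: f32 = lanes.iter().sum();

    // Remainder: at most 7 trailing elements, handled one at a time.
    while i < dims {
        total += vector_a[i] * vector_b[i];
        i += 1;
    }
    total
}

/// Safe wrapper: uses the AVX2 path only when the CPU reports support.
fn dot_sketch(a: &[f32], b: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Sound because the required CPU feature was just detected.
            return unsafe { avx2_dot_sketch(a, b) };
        }
    }
    scalar_dot_sketch(a, b)
}
```

Calling `dot_sketch(&a, &b)` keeps the `unsafe` call behind a runtime check. Note that a single accumulator makes every add depend on the previous one, so a real implementation might use several accumulators; that choice is outside what the preview shows.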
fn simple_dot_product(vector_a: &[f32], vector_b: &[f32]) -> f32 {
    assert_eq!(vector_a.len(), vector_b.len(), "Vectors must be equal in length");
    let dims = vector_a.len();
    let mut total = 0.0;
    // Quick disclaimer: I'm going to be using while loops here for consistency across
    // the code samples.
    let mut i = 0;
    while i < dims {
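The preview is cut off at the start of the loop. For reference, a complete scalar version in the same `while`-loop style might look like the sketch below. The loop body and the name `simple_dot_product_sketch` are assumptions, since the original body is not visible here.

```rust
/// Scalar dot product written with a while loop, mirroring the preview's style.
/// The loop body is an assumption; the original is truncated.
fn simple_dot_product_sketch(vector_a: &[f32], vector_b: &[f32]) -> f32 {
    assert_eq!(vector_a.len(), vector_b.len(), "Vectors must be equal in length");
    let dims = vector_a.len();
    let mut total = 0.0;

    let mut i = 0;
    while i < dims {
        // Multiply matching elements and accumulate the running sum.
        total += vector_a[i] * vector_b[i];
        i += 1;
    }
    total
}
```

For example, `simple_dot_product_sketch(&[1.0, 2.0, 3.0], &[4.0, 5.0, 6.0])` evaluates 4 + 10 + 18 and returns `32.0`.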
import {Jumper} from "svelte-loading-spinners";
import {DISCORD_LOGIN_URI} from "$lib/api/config";
import {onMount} from "svelte";
import {ALLOWED_ORIGIN, flags} from "../../lib/api/config.js";
import {extractUserData, requestUserAccess} from "../../lib/api/auth.js";
import {accessToken, currentUser} from "../../lib/stores/auth.js";
import {goto} from "$app/navigation";
import Primary from "../../lib/compontents/buttons/Primary.svelte";
import Secondary from "../../lib/compontents/buttons/Secondary.svelte";
import {writable} from "svelte/store";
use std::io;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use std::time::{Duration, Instant};
use ahash::{HashMap, HashMapExt, HashSet};
use parking_lot::Mutex;
use tokio::task::JoinHandle;
use crate::backends::ReadBuffer;
import asyncio
import os
import time
import cv2
import numpy
import pygame
import shutil
from threading import Thread
from queue import Queue