Leo leomrocha

## wsl2-network.ps1
$remoteport = bash.exe -c "ifconfig eth0 | grep 'inet '"
$found = $remoteport -match '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}';

if( $found ){
  $remoteport = $matches[0];
  echo $remoteport
} else{
  echo "The Script Exited, the ip address of WSL 2 cannot be found";
  exit;
}

## tokenizations_post.md

tamuhey / tokenizations_post.md
Last active July 27, 2024 14:46

How to calculate the alignment between BERT and spaCy tokens effectively and robustly


Natural Language Processing (NLP) has made great progress in recent years because of neural networks, which allow us to solve various tasks with end-to-end architectures. However, many NLP systems still require language-specific pre- and post-processing, especially around tokenization. In this article, I describe an algorithm that simplifies one such process: calculating the correspondence between tokens produced by different tokenizers (e.g. BERT vs. spaCy). I also introduce Python and Rust libraries that implement this algorithm.
Here are the library and the demo site links:

repo: https://github.com/tamuhey/tokenizations
site: https://tamuhey.github.io/tokenizations/
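
To make the idea of "correspondence between tokens" concrete, here is a minimal sketch. It is not the algorithm from the post and not the library's API: it assumes both tokenizations cover roughly the same text and matches characters with `difflib.SequenceMatcher` from the Python standard library. The names `char_to_token` and `align_tokens` are hypothetical, chosen only for this illustration.

```python
from difflib import SequenceMatcher


def char_to_token(tokens):
    """Return the concatenated text and, for each character, the index of its token."""
    owners = []
    for i, tok in enumerate(tokens):
        owners.extend([i] * len(tok))
    return "".join(tokens), owners


def align_tokens(a, b):
    """Hypothetical helper: map each token in `a` to overlapping tokens in `b`, and back.

    Two tokens are linked when they share at least one matched character.
    Unmatched characters (e.g. the '##' prefix of a word piece) are ignored.
    """
    text_a, own_a = char_to_token(a)
    text_b, own_b = char_to_token(b)
    a2b = [set() for _ in a]
    b2a = [set() for _ in b]
    matcher = SequenceMatcher(None, text_a, text_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            i = own_a[block.a + k]
            j = own_b[block.b + k]
            a2b[i].add(j)
            b2a[j].add(i)
    return [sorted(s) for s in a2b], [sorted(s) for s in b2a]


a2b, b2a = align_tokens(["New", "York"], ["New", "Yo", "rk"])
print(a2b)  # [[0], [1, 2]]
print(b2a)  # [[0], [1], [1]]
```

A character-level match like this tolerates small differences such as word-piece prefixes, but it does not handle text normalization (lowercasing, accent stripping) that real tokenizers may apply.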


## top-k-top-p.py
def top_k_top_p_filtering(logits, top_k=0, top_p=0.0, filter_value=-float('Inf')):
    """ Filter a distribution of logits using top-k and/or nucleus (top-p) filtering
        Args:
            logits: logits distribution shape (vocabulary size)
            top_k >0: keep only top k tokens with highest probability (top-k filtering).
            top_p >0.0: keep the top tokens with cumulative probability >= top_p (nucleus filtering).
                Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751)
    """
    assert logits.dim() == 1  # batch size 1 for now - could be updated for more but the code would be less clear
    top_k = min(top_k, logits.size(-1))  # Safety check
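
The preview above stops before the filtering logic. As a rough illustration of what the docstring describes, here is a pure-Python sketch of top-k and nucleus (top-p) filtering on a plain list of logits. It is an assumption-laden stand-in, not the gist's PyTorch code: the name `top_k_top_p_filter` is hypothetical, and ties at the k-th value are all kept.

```python
import math


def top_k_top_p_filter(logits, top_k=0, top_p=0.0, filter_value=-math.inf):
    """Hypothetical pure-Python sketch; returns a new list, input is not modified."""
    out = list(logits)
    n = len(out)
    top_k = min(top_k, n)  # same safety check as in the snippet above

    if top_k > 0:
        # Keep every logit at least as large as the k-th largest one.
        kth = sorted(out, reverse=True)[top_k - 1]
        out = [x if x >= kth else filter_value for x in out]

    if top_p > 0.0:
        # Visit tokens from most to least likely and keep the smallest
        # prefix whose cumulative softmax probability reaches top_p.
        order = sorted(range(n), key=lambda i: out[i], reverse=True)
        peak = out[order[0]]
        exps = [math.exp(out[i] - peak) for i in order]  # numerically stable softmax
        total = sum(exps)
        keep = set()
        cumulative = 0.0
        for i, e in zip(order, exps):
            keep.add(i)
            cumulative += e / total
            if cumulative >= top_p:
                break
        out = [x if i in keep else filter_value for i, x in enumerate(out)]

    return out


print(top_k_top_p_filter([2.0, 1.0, 0.0, -1.0], top_k=2))  # [2.0, 1.0, -inf, -inf]
```

Filtered positions are set to `-inf` so that a subsequent softmax assigns them zero probability before sampling.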

## freeze_example.py
import torch
from torch import nn
from torch.autograd import Variable
import torch.nn.functional as F
import torch.optim as optim


# toy feed-forward net
class Net(nn.Module):
    def __init__(self):

## pytorch-variable-number-of-inputs.ipynb

andreh7 / pytorch-variable-number-of-inputs.ipynb
Last active March 10, 2021 15:15

Learning a function with a variable number of inputs with PyTorch

## metainfo.py
"""Extract metadata from Project Gutenberg RDF catalog into a Python dict.

Based on https://bitbucket.org/c-w/gutenberg/

>>> md = readmetadata()
>>> md[123]
{'LCC': {'PS'},
 'author': u'Burroughs, Edgar Rice',
 'authoryearofbirth': 1875,
 'authoryearofdeath': 1950,