(base) [dyuret@dy02 test]$ cuda-memcheck julia --startup-file=no
               _
   _       _ _(_)_     |  Documentation:
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.4.2 (2020-05-23)
 _/ |\__'_|_|_|\__'_|  |  Official release
|__/                   |
using AutoGrad, CuArrays, Knet, Test
using AutoGrad: @gcheck
using CuArrays: @cufunc
const GConstant01 = sqrt(2/pi)
const GConstant02 = 0.044715 * sqrt(2/pi)
const GConstant03 = GConstant01 / 2
# Main definition, broadcasted version works on Arrays

Noise Contrastive Estimation

Noise contrastive estimation (NCE) replaces the expensive vocabulary-sized softmax at the final layer of language models with a cheaper sampling-based operation, which significantly speeds up training. To motivate NCE, let us start with the basic equations for probabilistic models.
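Before the derivation, a rough sketch of the cost difference may help fix ideas. The code below is not taken from Knet or from this note; `softmax_nll`, `nce_loss`, `sample_noise`, and `logsigmoid` are hypothetical names, and the uniform noise distribution `q` is chosen only for the demo. It assumes the common NCE formulation in which the model scores are treated as already normalized log-probabilities and each target word is contrasted against `k` noise samples.

```julia
using LinearAlgebra, Random

# Numerically stable log(sigmoid(x)).
logsigmoid(x) = x >= 0 ? -log1p(exp(-x)) : x - log1p(exp(x))

# Full-softmax negative log-likelihood: computes a score for all V words.
function softmax_nll(W, b, h, y)
    s = W * h .+ b                       # V scores, one per vocabulary word
    m = maximum(s)                       # subtract the max for stability
    return m + log(sum(exp.(s .- m))) - s[y]
end

# NCE loss for one target word y and a vector of noise word indices.
# Only the rows of W for the target and the noise words are touched.
function nce_loss(W, b, h, y, noise, q)
    k = length(noise)
    # Log-odds that word i came from the model rather than from the noise q.
    delta(i) = dot(view(W, i, :), h) + b[i] - log(k * q[i])
    loss = -logsigmoid(delta(y))         # target should be classified as data
    for i in noise
        loss -= logsigmoid(-delta(i))    # noise words should be classified as noise
    end
    return loss
end

# Hypothetical helper: draw k word indices from q by inverse-CDF sampling.
function sample_noise(rng, q, k)
    cq = cumsum(q)
    return [min(searchsortedfirst(cq, rand(rng)), length(q)) for _ in 1:k]
end

# Demo with made-up sizes: V words, d hidden units, k noise samples.
rng = MersenneTwister(0)
V, d, k = 1000, 16, 10
W = 0.1 .* randn(rng, V, d); b = zeros(V); h = randn(rng, d)
q = fill(1 / V, V)                       # uniform noise, an assumption for the demo
noise = sample_noise(rng, q, k)
println("softmax: ", softmax_nll(W, b, h, 3), "  nce: ", nce_loss(W, b, h, 3, noise, q))
```

The softmax version multiplies `h` by all `V` rows of `W`, while the NCE version uses only `k + 1` rows, which is where the training speed-up comes from when `k` is much smaller than `V`. The two losses are different objectives, so their values are not directly comparable.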

To model the probability distribution $p(y)$ of a set of objects

using CuArrays, FileIO
isfile("foo.jld2") || download("", "foo.jld2")
a = load("foo.jld2","a")
function testalloc(a)
b = []
for n in 1:length(a)
if a[n] == 0
# Script for comparison with the original tensorflow model.
# julia> include("test-npz.jl")
# julia> attntest(src="en",tgt="vi") # to load data
# julia> s = S2S("foo_eval.npz", trn.src.vocab, trn.tgt.vocab) # foo_eval.npz dumped params from tensorflow/nmt
# julia> loss(s, dev) #=> loss=2.5475807f0 ppl=12.776157
# julia> bleu(s, dev) #=> bleu=23.16
using PyCall
np = pyimport("numpy")
rnnbug.ipynb
Created May 23, 2019
bug report from ege ersu
julia> using Knet, Profile
julia> include(Knet.dir("examples/mnist-mlp/mlp.jl"));
julia> @time MLP.main("");
27.817270 seconds (43.59 M allocations: 6.484 GiB, 6.02% gc time)
julia> @time MLP.main("");
4.748176 seconds (4.69 M allocations: 4.245 GiB, 5.49% gc time)
                                 Time                  Allocations
                        ──────────────────────   ───────────────────────
 Tot / % measured:          95.2s / 52.7%            11.2GiB / 8.32%

 Section        ncalls     time   %tot     avg     alloc   %tot      avg
 *[1]            25.4k    4.76s  9.49%   187μs   20.8MiB  2.19%        -
 Knet.A_mul_Bt   25.0k    3.93s  7.85%   157μs   9.55MiB  1.00%        -
 *.[1]           21.4k    4.44s  8.86%   208μs   21.1MiB  2.22%  1.01KiB