Skip to content

Instantly share code, notes, and snippets.

@ohno
Last active December 19, 2022 02:18
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save ohno/239c6452a2633f24a403f2bad21081dc to your computer and use it in GitHub Desktop.
Save ohno/239c6452a2633f24a403f2bad21081dc to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"id": "06a83c9f",
"metadata": {},
"source": [
"# 単精度・倍精度・四倍精度・八倍精度で固有値問題を解く\n",
"\n",
"このノートでは単精度・倍精度・(ついでに)疑似四倍精度・四倍精度・八倍精度で[フランク行列](https://doi.org/10.1137/0106026)・[ヒルベルト行列](https://doi.org/10.1007/BF02418278)の固有値を求めます. フランク行列の場合, 1000×1000程のサイズで単精度では最小固有値が1桁も正しく計算できなくなることを確かめました. より[条件数](https://ja.wikipedia.org/wiki/%E6%9D%A1%E4%BB%B6%E6%95%B0#%E8%A1%8C%E5%88%97%E3%81%AE%E6%9D%A1%E4%BB%B6%E6%95%B0)の大きい[ヒルベルト行列](https://ja.wikipedia.org/wiki/%E3%83%92%E3%83%AB%E3%83%99%E3%83%AB%E3%83%88%E8%A1%8C%E5%88%97)の場合, さらに小さい5×5~35×35程で単精度・倍精度・疑似四倍精度・四倍精度・八倍精度では最小固有値が1桁も正しく計算できなくなることを確かめました. "
]
},
{
"cell_type": "markdown",
"id": "999293a4",
"metadata": {},
"source": [
"## パッケージ・環境\n",
"\n",
"[Quadmath.jl](https://github.com/JuliaMath/DoubleFloats.jl)の四倍精度・Float128は[IEEE 754 binary128](https://en.wikipedia.org/wiki/IEEE_754#Basic_and_interchange_formats)準拠です. ついでに[DoubleFloats.jl](https://github.com/JuliaMath/DoubleFloats.jl)の疑似四倍精度・Double64も追加しました. [QD.jl](https://github.com/eschnett/QD.jl)の八倍精度・Float256はインストール, ビルド時にエラーが出たので, 今回は代わりにBigFloatを使いました. BigFloatの仮数は自由に変えられるので237ビットにして[IEEE 754 binary256](https://en.wikipedia.org/wiki/IEEE_754#Basic_and_interchange_formats)に合わせましたが, [指数部分はデフォルトから変えられないらしい](https://discourse.julialang.org/t/calculate-precision/7484/6)ので[IEEE 754 binary256](https://en.wikipedia.org/wiki/IEEE_754#Basic_and_interchange_formats)より大きくなっています. なお, BigFloatのバックエンドは[MPFR](https://www.mpfr.org/)で, Juliaに最初から入っているので特別なパッケージ等は不要です. 後ほど, `setprecision(237)`のようにして仮数の桁を変えます. 固有値については, Float32とFloat64までは標準ライブラリの[LinearAlgebra.jl](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/)(固有値部分はLAPACK)で計算できます. Double64, Float128, BigFloatは[GenericLinearAlgebra.jl](https://github.com/JuliaLinearAlgebra/GenericLinearAlgebra.jl)で計算します."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "9eba9c68",
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Julia Version 1.8.3\n",
"Commit 0434deb161 (2022-11-14 20:14 UTC)\n",
"Platform Info:\n",
" OS: Windows (x86_64-w64-mingw32)\n",
" CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz\n",
" WORD_SIZE: 64\n",
" LIBM: libopenlibm\n",
" LLVM: libLLVM-13.0.1 (ORCJIT, tigerlake)\n",
" Threads: 1 on 8 virtual cores\n"
]
}
],
"source": [
"# using Pkg\n",
"# Pkg.add(\"Quadmath\")\n",
"# Pkg.add(\"DoubleFloats\")\n",
"# Pkg.add(\"GenericLinearAlgebra\")\n",
"\n",
"using Printf # 表示\n",
"using DoubleFloats # Double64型のサポート\n",
"using Quadmath # Float128型のサポート\n",
"using LinearAlgebra # Float32, Float64までの固有値計算\n",
"using GenericLinearAlgebra # Double64, Float128, BigFloatの固有値計算\n",
"\n",
"versioninfo()"
]
},
{
"cell_type": "markdown",
"id": "225d0704",
"metadata": {},
"source": [
"## BigFloatを八倍精度の代わりに使う\n",
"\n",
"まず, BigFloatが単精度・倍精度・疑似四倍精度・四倍精度の代わりに使えるのか試します. それぞれの型の仮数部分のビット数はそれぞれ下記のようになっています."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "616d9026",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Float32 \t24\n",
"Float64 \t53\n",
"Double64 \t106\n",
"Float128 \t113\n",
"BigFloat \t256\n"
]
}
],
"source": [
"for t in [Float32, Float64, Double64, Float128, BigFloat]\n",
" println(t, \" \\t\", precision(t))\n",
"end"
]
},
{
"cell_type": "markdown",
"id": "03c8a13b",
"metadata": {},
"source": [
"例えば, `setprecision(53)`とするとBigFloatとFloat64の仮数部分のビット数が一致しますので, 近い振る舞いをすると考えられます. 有理数・Rational型の$1/3$からそれぞれの型に変換すると下記のようになります. ただし, 仮数のビット数が53のBigFloatをBigFloat(53)のように表しました."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "12043577",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Float32 \t0.333333343\n",
"BigFloat(24)\t0.333333343\n",
"\n",
"Float64 \t0.33333333333333331\n",
"BigFloat(53)\t0.33333333333333331\n",
"\n",
"Double64 \t0.333333333333333333333333333333335\n",
"BigFloat(106)\t0.333333333333333333333333333333335\n",
"\n",
"Float128 \t0.333333333333333333333333333333333317\n",
"BigFloat(113)\t0.333333333333333333333333333333333317\n",
"\n",
"BigFloat(237)\t0.3333333333333333333333333333333333333333333333333333333333333333333333326\n",
"\n",
"BigFloat(256)\t0.3333333333333333333333333333333333333333333333333333333333333333333333333333348\n"
]
}
],
"source": [
"for t in [Float32, Float64, Double64, Float128]\n",
" p = precision(t)\n",
" setprecision(p) do\n",
" println(\"$t \\t\", big(convert(t,1//3)))\n",
" println(\"BigFloat($(precision(BigFloat)))\\t\", BigFloat(1//3))\n",
" println()\n",
" end\n",
"end\n",
"\n",
"setprecision(237) do\n",
" println(\"BigFloat($(precision(BigFloat)))\\t\", BigFloat(1//3))\n",
" println()\n",
"end\n",
"\n",
"println(\"BigFloat($(precision(BigFloat)))\\t\", BigFloat(1//3))"
]
},
{
"cell_type": "markdown",
"id": "5551f8cc",
"metadata": {},
"source": [
"BigFloatの仮数部分を他の型に合わせると, 少なくとも有理数・Rationalから変換した際の桁数は一致することが確かめられました. 完全には一致しないかもしれませんが, BigFloat(237)で概ねFloat256の目安にはなると思われます. 実際, 後ほどの固有値計算でもFloat128とBigFloat(113)は同じ行列サイズまで計算できるという結果を与えています. ただし, Float64の仮数に合わせたBigFloatで固有値を計算しても, Float64は[LinearAlgebra.jl](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/)で, BigFloatは[GenericLinearAlgebra.jl](https://github.com/JuliaLinearAlgebra/GenericLinearAlgebra.jl)で解かれるため, 結果は一致しません. Float128とBigFloatは同じ[GenericLinearAlgebra.jl](https://github.com/JuliaLinearAlgebra/GenericLinearAlgebra.jl)で解かれるため, 概ね一致します."
]
},
{
"cell_type": "markdown",
"id": "89286f58",
"metadata": {},
"source": [
"## フランク行列\n",
"\n",
"[フランク行列](https://doi.org/10.1137/0106026)の実装例を示します. まず整数型で要素を作ってから, `datatype`キーワードで指定した型へ変換するようにしてあります."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "11913d65",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5×5 Matrix{Int64}:\n",
" 5 4 3 2 1\n",
" 4 4 3 2 1\n",
" 3 3 3 2 1\n",
" 2 2 2 2 1\n",
" 1 1 1 1 1"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function Frank(n; datatype=Int64)\n",
" A = zeros(datatype, n, n)\n",
" for j in 1:n\n",
" for i in 1:n\n",
" A[i,j] = convert(datatype, n - max(i,j) + 1)\n",
" end\n",
" end\n",
" return A\n",
"end\n",
"\n",
"A = Frank(5)"
]
},
{
"cell_type": "markdown",
"id": "c6d0b25f",
"metadata": {},
"source": [
"[幸谷智紀『LAPACK/BLAS入門』(森北出版, 2016)](https://www.morikita.co.jp/books/mid/084881) p.76 表4.7を参考に$n=10,100,1000$における最小固有値を計算します. 表4.7と同様に$n=1000$では単精度で計算した最小固有値が1桁も合わないことが確かめられます."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "4ce843be",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"n = 10 \tλ[1]\n",
"Float32 \t0.25567994\n",
"Float64 \t0.2556795627964355\n",
"\n",
"n = 100 \tλ[1]\n",
"Float32 \t0.2500508\n",
"Float64 \t0.2500610827206968\n",
"\n",
"n = 1000 \tλ[1]\n",
"Float32 \t0.12658979\n",
"Float64 \t0.2500006162349238\n"
]
}
],
"source": [
"# using LinearAlgebra\n",
"# using GenericLinearAlgebra\n",
"\n",
"for n in [10,100,1000]\n",
" println(\"\\nn = $n \\tλ[1]\")\n",
" for datatype in [Float32,Float64]#,Double64,Float128,BigFloat]\n",
" A = Frank(n, datatype=datatype)\n",
" λ = eigvals(A)\n",
" println(datatype, \" \\t\", λ[1])\n",
" end\n",
"end"
]
},
{
"cell_type": "markdown",
"id": "6e5b5e99",
"metadata": {},
"source": [
"Float32では1000×1000のフランク行列の最小固有値が1桁も正しく計算できないことが確かめられました. 実際には900×900程度から振動し始め, おかしな挙動を見せます. フランク行列だとメモリや計算時間の限界が先に来ますので, より悪条件なヒルベルト行列で単精度・倍精度・疑似四倍精度・四倍精度・八倍精度の限界を見てみます."
]
},
{
"cell_type": "markdown",
"id": "7a9282b5",
"metadata": {},
"source": [
"## ヒルベルト行列\n",
"\n",
"[ヒルベルト行列](https://doi.org/10.1007/BF02418278)の実装例を示します. まず有理数型で要素を作り, それを`datatype`キーワードで指定した型に変換します. `1 / (i + j - 1)`だと勝手にFloat64と解釈されて丸め誤差が発生してしまいますので, `1 // (i + j - 1)`とすることでRationalから他の型に直接変換し, 丸め誤差を小さくできます."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "45813ff7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5×5 Matrix{Rational}:\n",
" 1//1 1//2 1//3 1//4 1//5\n",
" 1//2 1//3 1//4 1//5 1//6\n",
" 1//3 1//4 1//5 1//6 1//7\n",
" 1//4 1//5 1//6 1//7 1//8\n",
" 1//5 1//6 1//7 1//8 1//9"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function Hilbert(n; datatype=Rational)\n",
" A = zeros(datatype, n, n)\n",
" for j in 1:n\n",
" for i in 1:n\n",
" A[i,j] = convert(datatype, 1 // (i + j - 1))\n",
" end\n",
" end\n",
" return A\n",
"end\n",
"\n",
"H = Hilbert(5)"
]
},
{
"cell_type": "markdown",
"id": "7e5d6f31",
"metadata": {},
"source": [
"ヒルベルト行列のサイズを変えてFloat32, Float64, Double64, Float128, BigFloat(113), BigFloat(237), BigFloat(256)で固有値を計算します."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "d82e0a98",
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"n = 2 \tλ[1]\n",
"Float32 \t6.574144959e-02\n",
"Float64 \t6.574145409e-02\n",
"Double64 \t6.574145409e-02\n",
"Float128 \t6.574145409e-02\n",
"BigFloat(113) \t6.574145409e-02\n",
"BigFloat(237) \t6.574145409e-02\n",
"BigFloat(256) \t6.574145409e-02\n",
"BigFloat(1024) \t6.574145409e-02\n",
"\n",
"n = 3 \tλ[1]\n",
"Float32 \t2.687271917e-03\n",
"Float64 \t2.687340356e-03\n",
"Double64 \t2.687340356e-03\n",
"Float128 \t2.687340356e-03\n",
"BigFloat(113) \t2.687340356e-03\n",
"BigFloat(237) \t2.687340356e-03\n",
"BigFloat(256) \t2.687340356e-03\n",
"BigFloat(1024) \t2.687340356e-03\n",
"\n",
"n = 4 \tλ[1]\n",
"Float32 \t9.672460874e-05\n",
"Float64 \t9.670230402e-05\n",
"Double64 \t9.670230402e-05\n",
"Float128 \t9.670230402e-05\n",
"BigFloat(113) \t9.670230402e-05\n",
"BigFloat(237) \t9.670230402e-05\n",
"BigFloat(256) \t9.670230402e-05\n",
"BigFloat(1024) \t9.670230402e-05\n",
"\n",
"n = 5 \tλ[1]\n",
"Float32 \t3.361120889e-06\n",
"Float64 \t3.287928772e-06\n",
"Double64 \t3.287928772e-06\n",
"Float128 \t3.287928772e-06\n",
"BigFloat(113) \t3.287928772e-06\n",
"BigFloat(237) \t3.287928772e-06\n",
"BigFloat(256) \t3.287928772e-06\n",
"BigFloat(1024) \t3.287928772e-06\n",
"\n",
"n = 6 \tλ[1]\n",
"Float32 \t1.306005259e-07\n",
"Float64 \t1.082799486e-07\n",
"Double64 \t1.082799485e-07\n",
"Float128 \t1.082799485e-07\n",
"BigFloat(113) \t1.082799485e-07\n",
"BigFloat(237) \t1.082799485e-07\n",
"BigFloat(256) \t1.082799485e-07\n",
"BigFloat(1024) \t1.082799485e-07\n",
"\n",
"n = 7 \tλ[1]\n",
"Float32 \t7.071950847e-08\n",
"Float64 \t3.493898716e-09\n",
"Double64 \t3.493898606e-09\n",
"Float128 \t3.493898606e-09\n",
"BigFloat(113) \t3.493898606e-09\n",
"BigFloat(237) \t3.493898606e-09\n",
"BigFloat(256) \t3.493898606e-09\n",
"BigFloat(1024) \t3.493898606e-09\n",
"\n",
"n = 8 \tλ[1]\n",
"Float32 \t-4.826334887e-08\n",
"Float64 \t1.111539635e-10\n",
"Double64 \t1.111538966e-10\n",
"Float128 \t1.111538966e-10\n",
"BigFloat(113) \t1.111538966e-10\n",
"BigFloat(237) \t1.111538966e-10\n",
"BigFloat(256) \t1.111538966e-10\n",
"BigFloat(1024) \t1.111538966e-10\n",
"\n",
"n = 9 \tλ[1]\n",
"Float32 \t2.470161853e-09\n",
"Float64 \t3.499736372e-12\n",
"Double64 \t3.499676403e-12\n",
"Float128 \t3.499676403e-12\n",
"BigFloat(113) \t3.499676403e-12\n",
"BigFloat(237) \t3.499676403e-12\n",
"BigFloat(256) \t3.499676403e-12\n",
"BigFloat(1024) \t3.499676403e-12\n",
"\n",
"n = 10 \tλ[1]\n",
"Float32 \t-6.898610394e-08\n",
"Float64 \t1.093880753e-13\n",
"Double64 \t1.093153819e-13\n",
"Float128 \t1.093153819e-13\n",
"BigFloat(113) \t1.093153819e-13\n",
"BigFloat(237) \t1.093153819e-13\n",
"BigFloat(256) \t1.093153819e-13\n",
"BigFloat(1024) \t1.093153819e-13\n",
"\n",
"n = 11 \tλ[1]\n",
"Float32 \t-8.925326256e-08\n",
"Float64 \t3.375236254e-15\n",
"Double64 \t3.393218595e-15\n",
"Float128 \t3.393218595e-15\n",
"BigFloat(113) \t3.393218595e-15\n",
"BigFloat(237) \t3.393218595e-15\n",
"BigFloat(256) \t3.393218595e-15\n",
"BigFloat(1024) \t3.393218595e-15\n",
"\n",
"n = 12 \tλ[1]\n",
"Float32 \t-3.159367168e-08\n",
"Float64 \t1.630874793e-16\n",
"Double64 \t1.047946398e-16\n",
"Float128 \t1.047946398e-16\n",
"BigFloat(113) \t1.047946398e-16\n",
"BigFloat(237) \t1.047946398e-16\n",
"BigFloat(256) \t1.047946398e-16\n",
"BigFloat(1024) \t1.047946398e-16\n",
"\n",
"n = 13 \tλ[1]\n",
"Float32 \t-6.428467003e-08\n",
"Float64 \t-3.723992712e-17\n",
"Double64 \t3.222901015e-18\n",
"Float128 \t3.222901015e-18\n",
"BigFloat(113) \t3.222901015e-18\n",
"BigFloat(237) \t3.222901015e-18\n",
"BigFloat(256) \t3.222901015e-18\n",
"BigFloat(1024) \t3.222901015e-18\n",
"\n",
"n = 14 \tλ[1]\n",
"Float32 \t-2.169667468e-08\n",
"Float64 \t6.961897267e-18\n",
"Double64 \t9.877051735e-20\n",
"Float128 \t9.877051735e-20\n",
"BigFloat(113) \t9.877051735e-20\n",
"BigFloat(237) \t9.877051735e-20\n",
"BigFloat(256) \t9.877051735e-20\n",
"BigFloat(1024) \t9.877051735e-20\n",
"\n",
"n = 15 \tλ[1]\n",
"Float32 \t-5.260721281e-08\n",
"Float64 \t-1.970813429e-17\n",
"Double64 \t3.017915296e-21\n",
"Float128 \t3.017915296e-21\n",
"BigFloat(113) \t3.017915296e-21\n",
"BigFloat(237) \t3.017915296e-21\n",
"BigFloat(256) \t3.017915296e-21\n",
"BigFloat(1024) \t3.017915296e-21\n",
"\n",
"n = 16 \tλ[1]\n",
"Float32 \t-2.375249863e-08\n",
"Float64 \t-7.991212304e-17\n",
"Double64 \t1.971645939e-22\n",
"Float128 \t9.197419821e-23\n",
"BigFloat(113) \t9.197419821e-23\n",
"BigFloat(237) \t9.197419821e-23\n",
"BigFloat(256) \t9.197419821e-23\n",
"BigFloat(1024) \t9.197419821e-23\n",
"\n",
"n = 17 \tλ[1]\n",
"Float32 \t-7.643087940e-08\n",
"Float64 \t-3.084917681e-17\n",
"Double64 \t2.796721645e-24\n",
"Float128 \t2.796721632e-24\n",
"BigFloat(113) \t2.796721632e-24\n",
"BigFloat(237) \t2.796721632e-24\n",
"BigFloat(256) \t2.796721632e-24\n",
"BigFloat(1024) \t2.796721632e-24\n",
"\n",
"n = 18 \tλ[1]\n",
"Float32 \t-3.352907640e-08\n",
"Float64 \t-2.922712756e-17\n",
"Double64 \t1.804347127e-25\n",
"Float128 \t8.487413205e-26\n",
"BigFloat(113) \t8.487413251e-26\n",
"BigFloat(237) \t8.487413230e-26\n",
"BigFloat(256) \t8.487413230e-26\n",
"BigFloat(1024) \t8.487413230e-26\n",
"\n",
"n = 19 \tλ[1]\n",
"Float32 \t-2.656284792e-08\n",
"Float64 \t-1.790409527e-17\n",
"Double64 \t3.747605947e-27\n",
"Float128 \t3.747605889e-27\n",
"BigFloat(113) \t3.747605889e-27\n",
"BigFloat(237) \t2.571239705e-27\n",
"BigFloat(256) \t2.571239705e-27\n",
"BigFloat(1024) \t2.571239705e-27\n",
"\n",
"n = 20 \tλ[1]\n",
"Float32 \t-2.125183407e-08\n",
"Float64 \t-2.797264572e-16\n",
"Double64 \t-1.905490628e-25\n",
"Float128 \t1.233279673e-28\n",
"BigFloat(113) \t1.233279672e-28\n",
"BigFloat(237) \t7.777377397e-29\n",
"BigFloat(256) \t7.777377397e-29\n",
"BigFloat(1024) \t7.777377397e-29\n",
"\n",
"n = 21 \tλ[1]\n",
"Float32 \t-5.624558241e-08\n",
"Float64 \t-1.348650289e-17\n",
"Double64 \t-2.714301231e-23\n",
"Float128 \t3.678871784e-30\n",
"BigFloat(113) \t3.678871738e-30\n",
"BigFloat(237) \t2.349182538e-30\n",
"BigFloat(256) \t2.349182538e-30\n",
"BigFloat(1024) \t2.349182538e-30\n",
"\n",
"n = 22 \tλ[1]\n",
"Float32 \t-9.640318410e-08\n",
"Float64 \t-1.951507152e-16\n",
"Double64 \t-2.703080487e-30\n",
"Float128 \t-2.703634429e-30\n",
"BigFloat(113) \t-2.703493685e-30\n",
"BigFloat(237) \t7.086841582e-32\n",
"BigFloat(256) \t7.086841582e-32\n",
"BigFloat(1024) \t7.086841582e-32\n",
"\n",
"n = 23 \tλ[1]\n",
"Float32 \t-1.269348502e-07\n",
"Float64 \t-1.840525074e-16\n",
"Double64 \t-1.395136546e-26\n",
"Float128 \t-1.395136676e-26\n",
"BigFloat(113) \t-1.395136676e-26\n",
"BigFloat(237) \t2.135463438e-33\n",
"BigFloat(256) \t2.135463438e-33\n",
"BigFloat(1024) \t2.135463438e-33\n",
"\n",
"n = 24 \tλ[1]\n",
"Float32 \t-5.662219849e-08\n",
"Float64 \t-1.536692857e-16\n",
"Double64 \t-6.294692986e-25\n",
"Float128 \t-6.294693199e-25\n",
"BigFloat(113) \t-6.294693199e-25\n",
"BigFloat(237) \t6.428050325e-35\n",
"BigFloat(256) \t6.428050325e-35\n",
"BigFloat(1024) \t6.428050325e-35\n",
"\n",
"n = 25 \tλ[1]\n",
"Float32 \t-9.834495529e-08\n",
"Float64 \t-1.158400702e-16\n",
"Double64 \t-9.786027902e-24\n",
"Float128 \t3.475441588e-36\n",
"BigFloat(113) \t7.644790285e-36\n",
"BigFloat(237) \t1.933092811e-36\n",
"BigFloat(256) \t1.933092811e-36\n",
"BigFloat(1024) \t1.933092811e-36\n",
"\n",
"n = 26 \tλ[1]\n",
"Float32 \t-8.256382245e-08\n",
"Float64 \t-1.745151220e-16\n",
"Double64 \t-9.530727276e-23\n",
"Float128 \t-3.164126445e-29\n",
"BigFloat(113) \t-3.164126445e-29\n",
"BigFloat(237) \t5.808263531e-38\n",
"BigFloat(256) \t5.808263531e-38\n",
"BigFloat(1024) \t5.808263531e-38\n",
"\n",
"n = 27 \tλ[1]\n",
"Float32 \t-1.064803996e-07\n",
"Float64 \t-3.982034472e-17\n",
"Double64 \t-2.741452206e-34\n",
"Float128 \t-1.690538599e-27\n",
"BigFloat(113) \t-1.690538566e-27\n",
"BigFloat(237) \t1.743773036e-39\n",
"BigFloat(256) \t1.743773036e-39\n",
"BigFloat(1024) \t1.743773036e-39\n",
"\n",
"n = 28 \tλ[1]\n",
"Float32 \t-8.385376304e-08\n",
"Float64 \t-3.548348719e-16\n",
"Double64 \t-1.942851263e-34\n",
"Float128 \t-2.793018499e-26\n",
"BigFloat(113) \t-2.793018543e-26\n",
"BigFloat(237) \t5.231305960e-41\n",
"BigFloat(256) \t5.231305960e-41\n",
"BigFloat(1024) \t5.231305960e-41\n",
"\n",
"n = 29 \tλ[1]\n",
"Float32 \t-1.160353804e-07\n",
"Float64 \t-6.199549616e-17\n",
"Double64 \t2.031105710e-35\n",
"Float128 \t-2.822973889e-25\n",
"BigFloat(113) \t-2.822973889e-25\n",
"BigFloat(237) \t1.568304495e-42\n",
"BigFloat(256) \t1.568304495e-42\n",
"BigFloat(1024) \t1.568304495e-42\n",
"\n",
"n = 30 \tλ[1]\n",
"Float32 \t-5.065797382e-08\n",
"Float64 \t-5.289389957e-17\n",
"Double64 \t-1.102984167e-34\n",
"Float128 \t-2.146399554e-24\n",
"BigFloat(113) \t-2.146399554e-24\n",
"BigFloat(237) \t4.698636565e-44\n",
"BigFloat(256) \t4.698636565e-44\n",
"BigFloat(1024) \t4.698636565e-44\n",
"\n",
"n = 31 \tλ[1]\n",
"Float32 \t-7.601806118e-08\n",
"Float64 \t-7.329944390e-17\n",
"Double64 \t-2.916407324e-34\n",
"Float128 \t-1.255399562e-26\n",
"BigFloat(113) \t-1.255399570e-26\n",
"BigFloat(237) \t1.406868380e-45\n",
"BigFloat(256) \t1.406868380e-45\n",
"BigFloat(1024) \t1.406868380e-45\n",
"\n",
"n = 32 \tλ[1]\n",
"Float32 \t-6.675949749e-08\n",
"Float64 \t-2.374236517e-17\n",
"Double64 \t-1.025210430e-34\n",
"Float128 \t-3.257260908e-36\n",
"BigFloat(113) \t-3.257273966e-36\n",
"BigFloat(237) \t4.210099032e-47\n",
"BigFloat(256) \t4.210099032e-47\n",
"BigFloat(1024) \t4.210099032e-47\n",
"\n",
"n = 33 \tλ[1]\n",
"Float32 \t-3.973698597e-07\n",
"Float64 \t-6.226694532e-16\n",
"Double64 \t7.189674368e-35\n",
"Float128 \t-2.193596836e-36\n",
"BigFloat(113) \t-2.196233777e-36\n",
"BigFloat(237) \t1.259226436e-48\n",
"BigFloat(256) \t1.259226436e-48\n",
"BigFloat(1024) \t1.259226436e-48\n",
"\n",
"n = 34 \tλ[1]\n",
"Float32 \t-2.554655794e-07\n",
"Float64 \t-3.335481520e-16\n",
"Double64 \t-3.129737789e-34\n",
"Float128 \t-1.976669371e-25\n",
"BigFloat(113) \t-1.976656135e-25\n",
"BigFloat(237) \t3.764454505e-50\n",
"BigFloat(256) \t3.764454505e-50\n",
"BigFloat(1024) \t3.764454505e-50\n",
"\n",
"n = 35 \tλ[1]\n",
"Float32 \t-2.991216945e-07\n",
"Float64 \t-5.137067885e-16\n",
"Double64 \t-2.859278165e-34\n",
"Float128 \t-2.330807894e-36\n",
"BigFloat(113) \t-4.558199889e-36\n",
"BigFloat(237) \t2.318729279e-51\n",
"BigFloat(256) \t1.124863268e-51\n",
"BigFloat(1024) \t1.124863268e-51\n",
"\n",
"n = 36 \tλ[1]\n",
"Float32 \t-1.958887168e-07\n",
"Float64 \t-4.311007920e-16\n",
"Double64 \t-2.255124275e-34\n",
"Float128 \t-3.238481422e-36\n",
"BigFloat(113) \t-3.284703834e-36\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"BigFloat(237) \t6.919587088e-53\n",
"BigFloat(256) \t3.359762135e-53\n",
"BigFloat(1024) \t3.359762135e-53\n",
"\n",
"n = 37 \tλ[1]\n",
"Float32 \t-1.831468239e-07\n",
"Float64 \t-6.535979384e-16\n",
"Double64 \t-2.116415176e-34\n",
"Float128 \t-4.635128397e-36\n",
"BigFloat(113) \t-4.608945882e-36\n",
"BigFloat(237) \t2.064207412e-54\n",
"BigFloat(256) \t1.003088162e-54\n",
"BigFloat(1024) \t1.003088162e-54\n",
"\n",
"n = 38 \tλ[1]\n",
"Float32 \t-3.336613190e-07\n",
"Float64 \t-4.512357590e-16\n",
"Double64 \t-4.432253018e-34\n",
"Float128 \t-7.250642462e-36\n",
"BigFloat(113) \t-4.329796608e-36\n",
"BigFloat(237) \t4.617382896e-56\n",
"BigFloat(256) \t6.155701539e-56\n",
"BigFloat(1024) \t2.993651511e-56\n",
"\n",
"n = 39 \tλ[1]\n",
"Float32 \t-2.744432948e-07\n",
"Float64 \t-5.247024262e-16\n",
"Double64 \t-1.290923053e-34\n",
"Float128 \t-3.133177866e-36\n",
"BigFloat(113) \t-3.137829276e-36\n",
"BigFloat(237) \t1.376487504e-57\n",
"BigFloat(256) \t1.835102354e-57\n",
"BigFloat(1024) \t8.931079340e-58\n",
"\n",
"n = 40 \tλ[1]\n",
"Float32 \t-2.591677912e-07\n",
"Float64 \t-4.441535386e-16\n",
"Double64 \t-4.783821138e-34\n",
"Float128 \t-1.174660257e-36\n",
"BigFloat(113) \t-1.174234562e-36\n",
"BigFloat(237) \t3.420000952e-59\n",
"BigFloat(256) \t4.102179409e-59\n",
"BigFloat(1024) \t2.663517273e-59\n"
]
}
],
"source": [
"# using LinearAlgebra\n",
"# using GenericLinearAlgebra\n",
"\n",
"for n in 2:40\n",
" println(\"\\nn = $n \\tλ[1]\")\n",
" for datatype in [Float32, Float64, Double64, Float128]\n",
" A = Hilbert(n, datatype=datatype)\n",
" λ = eigvals(A)\n",
" @printf(\"%s \\t%.9e\\n\", datatype, λ[1])\n",
" end\n",
"\n",
" for p in [113,237,256,1024]\n",
" setprecision(p) do\n",
" A = Hilbert(n, datatype=BigFloat)\n",
" λ = eigvals(A)\n",
" @printf(\"%s \\t%.9e\\n\", \"BigFloat($(precision(BigFloat)))\", λ[1])\n",
" end\n",
" end\n",
"\n",
"end\n",
"\n",
"# setprecision(BigFloat, 256)"
]
},
{
"cell_type": "markdown",
"id": "8ed6ccc4",
"metadata": {},
"source": [
"固有値が正しく計算できなくなったサイズを表にまとめました. Float32は$n=6\\rightarrow7$で1桁も合わなくなり, Float64は$n=12\\rightarrow13$で1桁も合わなくなりました. Float128とBigFloat(113)は同じサイズで固有値が計算できなくなっているので, BigFloat(237)はFloat256の目安になると思われます. BigFloat(256)はデフォルトのBigFloatです. なお, BigFloat(1024)より上は計算しておらず, 正しさが保証できないため, 表には載せていません.\n",
"\n",
"|型|固有値が1桁も正しく計算できなくなるサイズ|\n",
"|:---|:---|\n",
"|Float32|$$n=6\\rightarrow7$$|\n",
"|Float64|$$n=12\\rightarrow13$$|\n",
"|Double64|$$n=17\\rightarrow18$$|\n",
"|Float128|$$n=18\\rightarrow19$$|\n",
"|BigFloat(113)|$$n=18\\rightarrow19$$|\n",
"|BigFloat(237)|$$n=34\\rightarrow35$$|\n",
"|BigFloat(256)|$$n=37\\rightarrow38$$|"
]
},
{
"cell_type": "markdown",
"id": "63bf3206",
"metadata": {},
"source": [
"## 条件数からの考察\n",
"\n",
"条件数の定義は[Wikipedia](https://ja.wikipedia.org/wiki/%E6%9D%A1%E4%BB%B6%E6%95%B0)に譲ります. 小さいときが良条件 (well-conditioned)で, 大きいときが悪条件 (ill-conditioned)です. 条件数そのものは行列の性質であり, 問題によって許容される条件数が異なるものだと認識しています. 行列の条件数は`cond()`で求められます. これは最大特異値÷最小特異値の値とほぼ一致しますので, 恐らく特異値分解から求めているものと思われます."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "0159f831",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"n = 5\n",
"\n",
"Frank\n",
"cond(A) = 45.45516413147919\n",
"maximum(s) / minimum(s) = 45.45516413147921\n",
"\n",
"Hilbert\n",
"cond(A) = 476607.2502419338\n",
"maximum(s) / minimum(s) = 476607.2502419339\n"
]
},
{
"data": {
"text/plain": [
"476607.2502419339"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"n = 5\n",
"@show n\n",
"\n",
"println(\"\\nFrank\")\n",
"A = Frank(n)\n",
"U, s, V = svd(A)\n",
"@show cond(A)\n",
"@show maximum(s) / minimum(s)\n",
"\n",
"println(\"\\nHilbert\")\n",
"A = Hilbert(n)\n",
"U, s, V = svd(A)\n",
"@show cond(A)\n",
"@show maximum(s) / minimum(s)"
]
},
{
"cell_type": "markdown",
"id": "ad37205a",
"metadata": {},
"source": [
"フランク行列と比べるとヒルベルト行列では爆発的に条件数が増加していることがわかります. なお, Float64で条件数を計算しているので, ヒルベルト行列に関してはほとんど正しく計算できていないと思います."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "2bb0111e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" size \tFrank \tHilbert\n",
" 1²\t1.00 \t1.00 \n",
" 2²\t6.85 \t19.28 \n",
" 3²\t16.39 \t524.06 \n",
" 4²\t29.28 \t15513.74 \n",
" 5²\t45.46 \t476607.25 \n",
" 6²\t64.89 \t14951058.64 \n",
" 7²\t87.57 \t475367356.88\n",
" 8²\t113.50 \t15257575568.35\n",
" 9²\t142.67 \t493153322841.38\n",
" 10²\t175.09 \t16024930538618.01\n",
" 11²\t210.75 \t522477970426706.44\n",
" 12²\t249.65 \t16425825843167240.00\n",
" 13²\t291.80 \t4469107228796737536.00\n",
" 14²\t337.19 \t321378772096752768.00\n",
" 15²\t385.82 \t336592127973880256.00\n",
" 16²\t437.70 \t2217922700925939456.00\n",
" 17²\t492.82 \t983088859604528640.00\n",
" 18²\t551.18 \t2586326227549559808.00\n",
" 19²\t612.78 \t3422847393583360000.00\n",
" 20²\t677.62 \t6806966466008103936.00\n",
" 21²\t745.71 \t13487063198863669248.00\n",
" 22²\t817.04 \t11232775235552192512.00\n",
" 23²\t891.61 \t1132270328755750272.00\n",
" 24²\t969.43 \t1317604681202903296.00\n",
" 25²\t1050.48 \t7994506305529366708224.00\n",
" 26²\t1134.78 \t5787890769456193536.00\n",
" 27²\t1222.32 \t8174081571849901056.00\n",
" 28²\t1313.11 \t2009673932453556518912.00\n",
" 29²\t1407.13 \t380731206983849017344.00\n",
" 30²\t1504.40 \t47285141654614188032.00\n",
" 31²\t1604.91 \t3672802249571561472.00\n",
" 32²\t1708.66 \t7566357653338823680.00\n",
" 33²\t1815.66 \t9488563366583488512.00\n",
" 34²\t1925.90 \t5000159154944207872.00\n",
" 35²\t2039.38 \t14955514692108713984.00\n",
" 36²\t2156.10 \t12711285908062918656.00\n",
" 37²\t2276.06 \t16995276022847053824.00\n",
" 38²\t2399.27 \t54971576325829861376.00\n",
" 39²\t2525.72 \t13312249204127688704.00\n",
" 40²\t2655.41 \t5697452796254398464.00\n",
" 100²\t16370.24 \t33418526793170358272.00\n",
" 800²\t1038822.57 \t489630054495249432576.00\n",
" 900²\t1314578.30 \t1104160507815823409152.00\n",
"1000²\t1622756.82 \t1292840854143322816512.00\n"
]
}
],
"source": [
"println(\" size \\tFrank \\tHilbert\")\n",
"for i in [1:40..., 100, 800, 900, 1000]\n",
" cond₁ = cond(Frank(i, datatype=Float64))\n",
" cond₂ = cond(Hilbert(i, datatype=Float64))\n",
" @printf(\"%4d²\\t%-12.2f\\t%-12.2f\\n\", i, cond₁, cond₂)\n",
"end"
]
},
{
"cell_type": "markdown",
"id": "e0ac1c9a",
"metadata": {},
"source": [
"条件数から何か考察したいと思いましたが, 固有値の精度との関係をまだ理解していません. Float32がフランク行列の固有値を正しく計算できたのは900×900程までで, ヒルベルト行列が正しく計算できなくなったのは概ね5×5~7×7までです. 条件数が大体同じくらいの桁なので, 何か関係がありそうです. 現在, 下記のような本を読んで勉強中しています. せっかくなのでアドベントカレンダーに載せて有識者にコメントを頂きたいと思っています.\n",
"\n",
"- [菊地文雄『数値計算の誤差と精度』(丸善出版, 2022)](https://www.maruzen-publishing.co.jp/item/b304781.html)\n",
"- [二宮市三, 吉田年雄, 長谷川武光, 秦野世, 杉浦洋, 櫻井鉄也『数値計算のつぼ』(共立出版, 2004)](https://www.kyoritsu-pub.co.jp/book/b10010279.html)\n",
"- [日本応用数理学会 監修・櫻井鉄也・松尾宇泰・片桐孝洋 編『数値線形代数の数理とHPC』(共立出版, 2018)](https://www.kyoritsu-pub.co.jp/book/b10003102.html)\n",
"- [F.シャトラン著, 伊理正夫・伊理由美訳『行列の固有値 新装版 最新の解法と応用』(丸善出版, 2012)](https://www.maruzen-publishing.co.jp/item/b294297.html)"
]
},
{
"cell_type": "markdown",
"id": "aa7d2dd2",
"metadata": {},
"source": [
"## まとめ\n",
"\n",
"単精度・倍精度・疑似四倍精度・四倍精度・八倍精度で[フランク行列](https://doi.org/10.1137/0106026)・[ヒルベルト行列](https://doi.org/10.1007/BF02418278)の固有値を求めました. フランク行列の場合, 1000×1000程のサイズで単精度では固有値が1桁も正しく計算できなくなることを確かめました. より[条件数](https://ja.wikipedia.org/wiki/%E6%9D%A1%E4%BB%B6%E6%95%B0#%E8%A1%8C%E5%88%97%E3%81%AE%E6%9D%A1%E4%BB%B6%E6%95%B0)の大きい[ヒルベルト行列](https://ja.wikipedia.org/wiki/%E3%83%92%E3%83%AB%E3%83%99%E3%83%AB%E3%83%88%E8%A1%8C%E5%88%97)の場合, 下表のような結果になりました.\n",
"\n",
"|型|固有値が1桁も正しく計算できなくなるサイズ|\n",
"|:---|:---|\n",
"|Float32|$$n=6\\rightarrow7$$|\n",
"|Float64|$$n=12\\rightarrow13$$|\n",
"|Double64|$$n=17\\rightarrow18$$|\n",
"|Float128|$$n=18\\rightarrow19$$|\n",
"|BigFloat(113)|$$n=18\\rightarrow19$$|\n",
"|BigFloat(237)|$$n=34\\rightarrow35$$|\n",
"|BigFloat(256)|$$n=37\\rightarrow38$$|\n",
"\n",
"単精度がいくら速かろうと, 間違った結果を与えるのではお話になりませんので, 倍精度が使われます. 同じ理由で, 倍精度より遅かろうと四倍精度, 八倍精度を必要とする問題がありますので, より発展していくと嬉しいですね. この記事の執筆を通して, Juliaの高精度計算のポテンシャルが感じられました."
]
},
{
"cell_type": "markdown",
"id": "9e28c7c2",
"metadata": {},
"source": [
"## 付録:メモリ消費量\n",
"\n",
"各要素のメモリ消費量をそのまま要素数にかけているだけですので, 実際にはもう少し大きい値になります. 正確な値は`varinfo()`で確認できます."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "7619f6b2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"size\t\t Float32 \t Float64 \tFloat128 \tFloat256 \t\n",
"1²\t\t 4 B\t 8 B\t 16 B\t 32 B\t\n",
"10²\t\t 400 B\t 800 B\t 2 KB\t 3 KB\t\n",
"100²\t\t 40 KB\t 80 KB\t 160 KB\t 320 KB\t\n",
"1000²\t\t 4 MB\t 8 MB\t 16 MB\t 32 MB\t\n",
"2000²\t\t 16 MB\t 32 MB\t 64 MB\t 128 MB\t\n",
"3000²\t\t 36 MB\t 72 MB\t 144 MB\t 288 MB\t\n",
"4000²\t\t 64 MB\t 128 MB\t 256 MB\t 512 MB\t\n",
"5000²\t\t 100 MB\t 200 MB\t 400 MB\t 800 MB\t\n",
"6000²\t\t 144 MB\t 288 MB\t 576 MB\t 1.2 GB\t\n",
"7000²\t\t 196 MB\t 392 MB\t 784 MB\t 1.6 GB\t\n",
"8000²\t\t 256 MB\t 512 MB\t 1.0 GB\t 2.0 GB\t\n",
"9000²\t\t 324 MB\t 648 MB\t 1.3 GB\t 2.6 GB\t\n",
"10000²\t\t 400 MB\t 800 MB\t 1.6 GB\t 3.2 GB\t\n",
"11000²\t\t 484 MB\t 968 MB\t 1.9 GB\t 3.9 GB\t\n",
"12000²\t\t 576 MB\t 1.2 GB\t 2.3 GB\t 4.6 GB\t\n",
"13000²\t\t 676 MB\t 1.4 GB\t 2.7 GB\t 5.4 GB\t\n",
"14000²\t\t 784 MB\t 1.6 GB\t 3.1 GB\t 6.3 GB\t\n",
"15000²\t\t 900 MB\t 1.8 GB\t 3.6 GB\t 7.2 GB\t\n",
"16000²\t\t 1.0 GB\t 2.0 GB\t 4.1 GB\t 8.2 GB\t\n",
"17000²\t\t 1.2 GB\t 2.3 GB\t 4.6 GB\t 9.2 GB\t\n",
"18000²\t\t 1.3 GB\t 2.6 GB\t 5.2 GB\t 10.4 GB\t\n",
"19000²\t\t 1.4 GB\t 2.9 GB\t 5.8 GB\t 11.6 GB\t\n",
"20000²\t\t 1.6 GB\t 3.2 GB\t 6.4 GB\t 12.8 GB\t\n",
"30000²\t\t 3.6 GB\t 7.2 GB\t 14.4 GB\t 28.8 GB\t\n",
"40000²\t\t 6.4 GB\t 12.8 GB\t 25.6 GB\t 51.2 GB\t\n",
"50000²\t\t 10.0 GB\t 20.0 GB\t 40.0 GB\t 80.0 GB\t\n",
"60000²\t\t 14.4 GB\t 28.8 GB\t 57.6 GB\t115.2 GB\t\n",
"70000²\t\t 19.6 GB\t 39.2 GB\t 78.4 GB\t156.8 GB\t\n",
"80000²\t\t 25.6 GB\t 51.2 GB\t102.4 GB\t204.8 GB\t\n",
"90000²\t\t 32.4 GB\t 64.8 GB\t129.6 GB\t259.2 GB\t\n",
"100000²\t\t 40.0 GB\t 80.0 GB\t160.0 GB\t320.0 GB\t\n"
]
}
],
"source": [
"# using Printf\n",
"\n",
"function KMG(byte)\n",
" if byte > 10^9\n",
" return @sprintf(\"%5.1f GB\", byte/10^9)\n",
" elseif byte > 10^6\n",
" return @sprintf(\"%5.0f MB\", byte/10^6)\n",
" elseif byte > 10^3\n",
" return @sprintf(\"%5.0f KB\", byte/10^3)\n",
" else\n",
" return @sprintf(\"%5.0f B\", byte)\n",
" end\n",
"end\n",
"\n",
"\n",
"println(\"size\\t\\t\", [@sprintf(\"%9s\", \"Float$j \\t\") for j in [32,64,128,256]]...)\n",
"for i in union(1,10,100,1000:1000:20000, 30000:10000:100000)\n",
" println(\"$(i)²\\t\\t\", [KMG(i^2*j/8)*\"\\t\" for j in [32,64,128,256]]...)\n",
"end"
]
},
{
"cell_type": "markdown",
"id": "880f2da3",
"metadata": {},
"source": [
"## 参考文献\n",
"\n",
"- W. L. Frank, Jpn J Ind Appl Math, 6, 378 (1958) https://doi.org/10.1137/0106026\n",
"- D. Hilbert, Acta Math., 18, 155 (1894) https://doi.org/10.1007/BF02418278\n",
"- [幸谷智紀『LAPACK/BLAS入門』(森北出版, 2016)](https://www.morikita.co.jp/books/mid/084881)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.8.3",
"language": "julia",
"name": "julia-1.8"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment