@stepwise-ai-dev
Created August 11, 2023 20:06
$ conda activate myenv
$ RUST_BACKTRACE=full tokenise_bio -i /data/ncbi_dataset/GCF_000001405.39_GRCh38.p13_genomic.fasta -t '/data/generated/ncbi_tokenisers/tokeniser_39_GRCh38.json'
COMMAND LINE ARGUMENTS FOR REPRODUCIBILITY:
/home/ec2-user/mambaforge/envs/myenv/bin/tokenise_bio -i /data/ncbi_dataset/GCF_000001405.39_GRCh38.p13_genomic.fasta -t /data/generated/ncbi_tokenisers/tokeniser_39_GRCh38.json
[00:01:25] Pre-processing sequences ███████████████████████████████████████████████████████████████████████████████████████████████ 0 / 0
[00:00:00] Suffix array seeds ███████████████████████████████████████████████████████████████████████████████████████████████ 0 / 0
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Internal', tokenizers-lib/src/models/unigram/trainer.rs:212:53
stack backtrace:
0: 0x7f08eb392d2d - std::backtrace_rs::backtrace::libunwind::trace::h22893a5306c091b4
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
1: 0x7f08eb392d2d - std::backtrace_rs::backtrace::trace_unsynchronized::h29c3bc6f9e91819d
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x7f08eb392d2d - std::sys_common::backtrace::_print_fmt::he497d8a0ec903793
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:66:5
3: 0x7f08eb392d2d - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h9c2a9d2774d81873
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:45:22
4: 0x7f08eb3b4a0c - core::fmt::write::hba4337c43d992f49
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/fmt/mod.rs:1194:17
5: 0x7f08eb38d7b1 - std::io::Write::write_fmt::heb73de6e02cfabed
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/io/mod.rs:1655:15
6: 0x7f08eb394855 - std::sys_common::backtrace::_print::h63c8b24acdd8e8ce
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:48:5
7: 0x7f08eb394855 - std::sys_common::backtrace::print::h426700d6240cdcc2
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:35:9
8: 0x7f08eb394855 - std::panicking::default_hook::{{closure}}::hc9a76eed0b18f82b
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:295:22
9: 0x7f08eb394509 - std::panicking::default_hook::h2e88d02087fae196
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:314:9
10: 0x7f08eb394da2 - std::panicking::rust_panic_with_hook::habfdcc2e90f9fd4c
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:698:17
11: 0x7f08eb394c87 - std::panicking::begin_panic_handler::{{closure}}::he054b2a83a51d2cd
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:588:13
12: 0x7f08eb3931e4 - std::sys_common::backtrace::__rust_end_short_backtrace::ha48b94ab49b30915
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:138:18
13: 0x7f08eb3949b9 - rust_begin_unwind
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:584:5
14: 0x7f08eae40153 - core::panicking::panic_fmt::h366d3a309ae17c94
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:143:14
15: 0x7f08eae40243 - core::result::unwrap_failed::hddd78f4658ac7d0f
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/result.rs:1785:5
16: 0x7f08eb0a5748 - tokenizers::models::unigram::trainer::UnigramTrainer::do_train::h82b421a34bd7ad59
17: 0x7f08eb02c81f - <tokenizers::models::TrainerWrapper as tokenizers::tokenizer::Trainer>::train::hf338d3868b92a1a9
18: 0x7f08eaecb3f1 - <tokenizers::trainers::PyTrainer as tokenizers::tokenizer::Trainer>::train::h797d6fbe1484d439
19: 0x7f08eaeeeabe - tokenizers::tokenizer::TokenizerImpl<M,N,PT,PP,D>::train::hc262ae02e631cc04
20: 0x7f08eaf55ca3 - tokenizers::utils::iter::ResultShunt<I,E>::process::h9b0c24be256be1db
21: 0x7f08eafefc00 - <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::hf2eb4b8dad41f8fc
22: 0x7f08eafb0517 - pyo3::python::Python::allow_threads::hfe4597fb821e57cf
23: 0x7f08eaf43e85 - tokenizers::tokenizer::PyTokenizer::train_from_iterator::h0d94a2269a130013
24: 0x7f08eae9432a - std::panicking::try::h37eed0e7d72cc93a
25: 0x7f08eaf4c3ca - tokenizers::tokenizer::__init4851609660326938678::__wrap::h19ece55281e9215f
26: 0x55afd68ee79c - cfunction_call
at /usr/local/src/conda/python-3.9.15/Objects/methodobject.c:543:19
27: 0x55afd68d4db7 - _PyObject_MakeTpCall
at /usr/local/src/conda/python-3.9.15/Objects/call.c:191:18
28: 0x55afd68d0e62 - _PyObject_VectorcallTstate
at /usr/local/src/conda/python-3.9.15/Include/cpython/abstract.h:116:16
29: 0x55afd68d0e62 - _PyObject_VectorcallTstate
at /usr/local/src/conda/python-3.9.15/Include/cpython/abstract.h:103:1
30: 0x55afd68d0e62 - PyObject_Vectorcall
at /usr/local/src/conda/python-3.9.15/Include/cpython/abstract.h:127:12
31: 0x55afd68d0e62 - call_function
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:5077:13
32: 0x55afd68d0e62 - _PyEval_EvalFrameDefault
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:3537:19
33: 0x55afd68ca8b7 - _PyEval_EvalFrame
at /usr/local/src/conda/python-3.9.15/Include/internal/pycore_ceval.h:40:12
34: 0x55afd68ca8b7 - _PyEval_EvalCode
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:4329:14
35: 0x55afd68ec198 - _PyFunction_Vectorcall
at /usr/local/src/conda/python-3.9.15/Objects/call.c:396:12
36: 0x55afd68ec198 - _PyObject_VectorcallTstate
at /usr/local/src/conda/python-3.9.15/Include/cpython/abstract.h:118:11
37: 0x55afd68ec198 - method_vectorcall
at /usr/local/src/conda/python-3.9.15/Objects/classobject.c:53:18
38: 0x55afd68cca70 - _PyObject_VectorcallTstate
at /usr/local/src/conda/python-3.9.15/Include/cpython/abstract.h:118:11
39: 0x55afd68cca70 - PyObject_Vectorcall
at /usr/local/src/conda/python-3.9.15/Include/cpython/abstract.h:127:12
40: 0x55afd68cca70 - call_function
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:5077:13
41: 0x55afd68cca70 - _PyEval_EvalFrameDefault
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:3537:19
42: 0x55afd68dd113 - _PyEval_EvalFrame
at /usr/local/src/conda/python-3.9.15/Include/internal/pycore_ceval.h:40:12
43: 0x55afd68dd113 - function_code_fastcall
at /usr/local/src/conda/python-3.9.15/Objects/call.c:330:24
44: 0x55afd68cbc4f - _PyObject_VectorcallTstate
at /usr/local/src/conda/python-3.9.15/Include/cpython/abstract.h:118:11
45: 0x55afd68cbc4f - PyObject_Vectorcall
at /usr/local/src/conda/python-3.9.15/Include/cpython/abstract.h:127:12
46: 0x55afd68cbc4f - call_function
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:5077:13
47: 0x55afd68cbc4f - _PyEval_EvalFrameDefault
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:3520:19
48: 0x55afd68ca8b7 - _PyEval_EvalFrame
at /usr/local/src/conda/python-3.9.15/Include/internal/pycore_ceval.h:40:12
49: 0x55afd68ca8b7 - _PyEval_EvalCode
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:4329:14
50: 0x55afd68ca577 - _PyEval_EvalCodeWithName
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:4361:12
51: 0x55afd68ca529 - PyEval_EvalCodeEx
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:4377:12
52: 0x55afd6985cdb - PyEval_EvalCode
at /usr/local/src/conda/python-3.9.15/Python/ceval.c:828:12
53: 0x55afd69b4229 - run_eval_code_obj
at /usr/local/src/conda/python-3.9.15/Python/pythonrun.c:1221:9
54: 0x55afd69b03c4 - run_mod
at /usr/local/src/conda/python-3.9.15/Python/pythonrun.c:1242:19
55: 0x55afd6835673 - pyrun_file
at /usr/local/src/conda/python-3.9.15/Python/pythonrun.c:1140:15
56: 0x55afd69a9f02 - pyrun_simple_file
at /usr/local/src/conda/python-3.9.15/Python/pythonrun.c:450:13
57: 0x55afd69a9f02 - PyRun_SimpleFileExFlags
at /usr/local/src/conda/python-3.9.15/Python/pythonrun.c:483:15
58: 0x55afd69a7263 - pymain_run_file
at /usr/local/src/conda/python-3.9.15/Modules/main.c:377:15
59: 0x55afd69a7263 - pymain_run_python
at /usr/local/src/conda/python-3.9.15/Modules/main.c:602:21
60: 0x55afd69a7263 - Py_RunMain
at /usr/local/src/conda/python-3.9.15/Modules/main.c:681:5
61: 0x55afd6979979 - Py_BytesMain
at /usr/local/src/conda/python-3.9.15/Modules/main.c:1101:12
62: 0x7f08f2bde13a - __libc_start_main
63: 0x55afd6979881 - <unknown>
Traceback (most recent call last):
File "/home/ec2-user/mambaforge/envs/myenv/bin/tokenise_bio", line 11, in <module>
sys.exit(main())
File "/home/ec2-user/mambaforge/envs/myenv/lib/python3.9/site-packages/genomenlp/tokenise_bio.py", line 62, in main
tokeniser.train_from_iterator(
File "/home/ec2-user/mambaforge/envs/myenv/lib/python3.9/site-packages/tokenizers/implementations/sentencepiece_unigram.py", line 142, in train_from_iterator
self._tokenizer.train_from_iterator(
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: Internal