# coding: utf-8
"""
Trainer for the chatbot intent classifier: a neural network on top of ELMo.
05.09.2019 first implementation, based on the code of train_intent_classifier_bert.py
"""
from __future__ import print_function
import numpy as np
import argparse
import platform
# coding: utf-8
"""
Trainer for the chatbot intent classifier: a neural network on top of BERT.
13.07.2019 first implementation
13.07.2019 added grid search for tuning the network's hyperparameters
20.07.2019 reworked to use nlu.md directly
26.07.2019 f1_weighted is used as the cross-validation metric
"""
from __future__ import print_function
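The 26.07.2019 changelog entry names f1_weighted as the cross-validation metric. As a rough illustration of what that metric computes (per-class F1 averaged with weights proportional to each class's share of the true labels), here is a minimal pure-Python sketch. The function name `weighted_f1` and the sample intent labels are hypothetical; the original script presumably relies on a library scorer rather than code like this.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (hypothetical helper)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives and predicted count for this class
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Weight each class by its share of the true labels
        score += f1 * n_true / total
    return score


# Made-up intent labels, for illustration only
y_true = ["greet", "greet", "bye", "ask"]
y_pred = ["greet", "bye", "bye", "ask"]
score = weighted_f1(y_true, y_pred)
```

Weighting by support keeps frequent intents from being drowned out by rare ones, which is one plausible reason to prefer it over a plain macro average for an unbalanced intent dataset.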
# -*- coding: utf-8 -*-
'''
Training of a model that learns, character by character in teacher forcing mode, to generate
an answer for a given premise and question.
A neural network serves as the classification engine for choosing characters.
Each run of the model selects one new character, which is appended to the previously generated
chain of answer characters (see the generate_answer function). Generation continues through repeated runs
until the special end-of-chain marker END_CHAR appears.
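The generation loop described above can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: the signature of `generate_answer`, the value of `END_CHAR`, the `max_len` safety limit, and the stand-in `toy_model` are all hypothetical, and the real script would call the trained network where `predict_next_char` is used here.

```python
END_CHAR = "\n"  # assumed marker value; the original constant is not shown


def generate_answer(predict_next_char, premise, question, max_len=100):
    """Greedy character-by-character decoding until END_CHAR or max_len."""
    answer = ""
    while len(answer) < max_len:
        # One model run yields exactly one new character
        next_char = predict_next_char(premise, question, answer)
        if next_char == END_CHAR:
            break
        answer += next_char
    return answer


def toy_model(premise, question, generated):
    """Hypothetical stand-in for the network: spells out a fixed answer."""
    target = "yes" + END_CHAR
    return target[len(generated)]


result = generate_answer(toy_model, "The cat sleeps.", "Is the cat asleep?")
```

The `max_len` cap is a design choice added for the sketch: it guards against a model that never emits the end marker.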
import io
import collections
import re
import ruword2tags
# \s must stay outside re.escape(); escaped, it would match a literal backslash and the letter "s"
regex1 = re.compile(u'[%s\\s]+' % re.escape(u'"«».,:;!?=()\t\u00a0\u202F\u2060\u200A'))
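A short sketch of how a delimiter pattern of this kind might be used to split a phrase into words. It assumes the `\s` in the pattern is meant as the whitespace class, so it is kept outside `re.escape`. The names `PUNCT`, `delimiters`, and `split_words` are hypothetical and do not come from the original file.

```python
import re

# Punctuation and special spaces treated as delimiters (copied from the pattern above)
PUNCT = u'"«».,:;!?=()\t\u00a0\u202F\u2060\u200A'
# \s is added after escaping so that it remains a whitespace class
delimiters = re.compile(u'[%s\\s]+' % re.escape(PUNCT))


def split_words(text):
    """Split text on runs of delimiters, dropping empty pieces."""
    return [w for w in delimiters.split(text) if w]


words = split_words(u'«Привет», сказал\u00a0он: как дела?')
```

Filtering out empty strings matters because `split` returns an empty piece whenever the text starts or ends with a delimiter.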
# -*- coding: utf-8 -*-
"""
Selection of raw material for building a dataset to train the syntax validator.
We take phrases with correct syntax and replace their prepositions with random ones,
re-inflecting the dependent nouns and adjectives where necessary.
"""
from __future__ import division  # for python2 compatibility
from __future__ import print_function
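The corruption step described in the docstring can be sketched roughly as below. This is a simplified illustration: it only swaps prepositions and deliberately skips the re-inflection of dependent nouns and adjectives, which in the original presumably needs a morphological dictionary. The preposition inventory, the function name `corrupt_prepositions`, and the sample phrase are all hypothetical.

```python
import random

# Hypothetical small inventory; a real one would be much larger
PREPOSITIONS = [u"в", u"на", u"под", u"над", u"за", u"к"]


def corrupt_prepositions(tokens, rng):
    """Replace every preposition with a different, randomly chosen one."""
    corrupted = []
    for token in tokens:
        if token.lower() in PREPOSITIONS:
            # Exclude the original so the phrase is guaranteed to change
            choices = [p for p in PREPOSITIONS if p != token.lower()]
            corrupted.append(rng.choice(choices))
        else:
            corrupted.append(token)
    return corrupted


rng = random.Random(0)  # fixed seed for reproducible samples
original = [u"кошка", u"спит", u"на", u"диване"]
corrupted = corrupt_prepositions(original, rng)
```

Pairs of `original` and `corrupted` phrases could then serve as positive and negative samples for the validator.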
# -*- coding: utf-8 -*-
"""
Solving a linear regression problem by ordinary least squares (OLS) with Keras.
"""
from __future__ import print_function
import random
import numpy as np
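For reference, the least squares problem named in the docstring has a closed-form solution in the single-feature case, which can be handy for checking the weights a trained network converges to. The sketch below is not the Keras code from this file; `ols_fit` and the sample data are hypothetical.

```python
def ols_fit(xs, ys):
    """Closed-form OLS for y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance sums (unnormalized; the factor 1/n cancels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up noiseless data lying on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = ols_fit(xs, ys)
```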
# -*- coding: utf-8 -*-
"""
Using autograd automatic differentiation (https://github.com/HIPS/autograd)
to solve OLS linear regression.
"""
from __future__ import print_function
import autograd.numpy as np
# -*- coding: utf-8 -*-
"""
Using autograd automatic differentiation (https://github.com/HIPS/autograd)
to solve OLS linear regression.
The code can only solve the linear regression problem, because
gradient descent is written out separately for each of the two
solution components via partial derivatives.
"""
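The scheme the docstring describes, with a separate gradient descent update for each of the two solution components, can be sketched as follows. To keep the example dependency-free, the partial derivatives of the mean squared error are written by hand here instead of being obtained from autograd as in the original. The function name `fit_line`, the learning rate, the step count, and the data are assumptions.

```python
def fit_line(xs, ys, lr=0.05, n_steps=2000):
    """Fit y = a * x + b by gradient descent on the mean squared error."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        residuals = [a * x + b - y for x, y in zip(xs, ys)]
        # Partial derivative of the MSE with respect to the slope a
        grad_a = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        # Partial derivative of the MSE with respect to the intercept b
        grad_b = 2.0 / n * sum(residuals)
        # Each component gets its own update
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Made-up noiseless data lying on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
```

Because the two updates are spelled out explicitly, adding a third parameter would mean writing a third derivative by hand, which is the limitation the docstring points out.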
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using NNSharp.DataTypes;
namespace sample1
{
class Program