Alan Lee secsilm

secsilm / merge_mijia.py
Last active April 9, 2024 09:16
Merge Mijia camera surveillance videos into one video file per day.
import argparse
import subprocess
from pathlib import Path
from loguru import logger
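parser = argparse.ArgumentParser(description='Merge Mijia camera videos, one file per day.')
parser.add_argument('indir', help='Directory containing the original Mijia camera videos.')
parser.add_argument('--outdir', default='./', help='Directory for the merged videos; created if it does not exist. Defaults to the current directory.')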
args = parser.parse_args()
secsilm / sklearn-tfidf.ipynb
Last active June 16, 2020 06:38
How sklearn computes TF-IDF
secsilm / read_color.py
Last active August 1, 2019 06:02
Read an Excel file in Python while preserving formatting
from openpyxl import load_workbook


def read_color(f):
    wb = load_workbook(f)
    ws = wb.active
    for row in ws.iter_rows():
        for cell in row:
            print(f"cell value={cell.value}, cell color={cell.fill.start_color.index}")
secsilm / tensorflowhub_share_ppt.ipynb
Last active June 4, 2019 08:28
Sentiment classification of hotel reviews with TensorFlow Estimators and TensorFlow Hub
secsilm / standardization-vs-normalization.ipynb
Created April 25, 2018 03:32
Standardization and Normalization in sklearn
secsilm / unzip-jay.py
Created April 4, 2018 11:51
Extract zip files located in multiple folders into separate folders under another directory
import re
import shutil
import warnings
import zipfile
from pathlib import Path
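# Directory containing the zip files (raw string so the backslashes are not treated as escapes)
in_path = Path(r'D:\BaiduYunDownload\Jay Chou')
# Directory to extract into
out_path = Path(r'D:\BaiduYunDownload')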
secsilm / color_scatter.html
Created March 15, 2018 05:39
The generated html file by color scatter example
This file has been truncated, but you can view the full file.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>color_scatter.py example</title>
<link rel="stylesheet" href="https://cdn.pydata.org/bokeh/release/bokeh-0.12.14.min.css" type="text/css" />
<script type="text/javascript" src="https://cdn.pydata.org/bokeh/release/bokeh-0.12.14.min.js"></script>
secsilm / cartopy-image.py
Last active September 4, 2017 13:20
Add an image on top of a map using cartopy
'''
This code is an example of adding an image on top of a map using cartopy.
The generated image can be found here: https://i.imgur.com/aTY1rYY.png
'''
import matplotlib.pyplot as plt
import cartopy.crs as crs
from matplotlib.offsetbox import AnnotationBbox, OffsetImage
from PIL import Image

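Improving Machine Learning Performance with Ensemble Learning

This article is a translation of a piece on ensemble models recommended by PythonWeekly. The original, Ensemble Learning to Improve Machine Learning Results, was published on Medium by Vadim Smolyakov on August 22, 2017. Vadim Smolyakov is a graduate student at MIT with a passion for data science and machine learning.

Ensemble learning improves machine learning results by combining several models. Compared with a single model, this approach can noticeably improve predictive performance, which is why ensemble models are favored in many well-known machine learning competitions, such as the Netflix competition, KDD 2009, and Kaggle.

Ensemble methods are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease variance (bagging), decrease bias (boosting), or improve predictions (stacking).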
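To make the bagging idea concrete, here is a minimal sketch using only the Python standard library. It is not taken from the original article: the helper names `fit_stump`, `bagging_fit`, and `bagging_predict`, the choice of a one-split regression stump as the base model, and the toy data are all assumptions made for illustration. Each stump is trained on a bootstrap resample of the data, and the ensemble prediction is the average of the individual predictions.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression stump (hypothetical base model for this sketch)."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # a split must leave points on both sides
        lm, rm = mean(left), mean(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:
        # All x values are identical: fall back to a constant prediction.
        const = mean(ys)
        return lambda x: const
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Average the predictions of all base models."""
    return mean(m(x) for m in models)


# Toy data (assumed for illustration): a step from 0 to 1 between x=3 and x=4.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(xs, ys)
print(bagging_predict(models, 0), bagging_predict(models, 7))
```

Averaging over bootstrap resamples is what smooths out the sensitivity of any single stump to the particular sample it saw. Boosting and stacking combine models differently and are not shown here.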

Ensemble methods can be divided into two groups:

secsilm / simple-autoencoder.ipynb
Created July 25, 2017 11:52
simple wrong autoencoder