@epogrebnyak
Last active November 7, 2020 09:20
What is in PyData 2020?
https://global.pydata.org/
1. Technical track - from simple to complex and high-quality
------------------------------------------------------------
DataFrames (the basic tabular data structure):
An introduction to DataFrames.jl for pandas users, by Bogumił Kamiński
Skinny Pandas Riding on a Rocket, by Ian Ozsvald (PyDataLondon) - confusing title
Parallelism and faster computation:
Parallel processing in Python: The current landscape, by Aaron Richter
Speed Up Your Data Processing: Parallel and Asynchronous Programming in Data Science, by Chin Hwee Ong
Hosting Dask: Challenges and Opportunities, by Matthew Rocklin
Supercharge Scientific Computing in Python with Numba, by Ankit Mahato
Dashboards:
Quickly deploying explainable AI dashboards, by Oege Dijk
Streamlit: The Fastest Way to build Data Apps, by Steven Kolawole
Scalable cross-filtering dashboards with Panel, HoloViews and hvPlot, by Philipp Rudiger and James A. Bednar
"Packaging for production" (how a model prototype becomes a working component of an IT system):
DevOps for science: using continuous integration for rigorous and reproducible analysis, by Elle O'Brien
How to guarantee your machine learning model will fail on first contact with the real world., by Jesper Dramsch
Growing Machine Learning Platforms in the Enterprise, by Hussain Sultan
Transformation from Research Oriented Code into Machine Learning APIs with Python, by Tetsuya Jesse Hirata
Monitoring machine learning models in production, by Arnaud Van Looveren
Meditations on First Deployment: A Practical Guide to Responsible Data Science & Engineering, by Alejandro Saucedo
Feature drift monitoring as a service for machine learning models at scale, by Keira Zhou and Noriaki Tatsumi
Code quality (data "scientists" chronically write mediocre code):
Rethinking Software Testing for Data Science, by Eduardo Blancas
Better Code for Data Science, by Alexander CS Hendorf
How to review a model, by Andy R. Terrel
Separation of ~concerns~ scales in software, by Thomas A Caswell
2. Does it all work well and what does it deliver in the end
------------------------------------------------------------
Ethics, explainability and fairness of models:
Tangible Steps Towards Algorithmic Accountability, by Ayodele Odubela (keynote)
Building fairer models for finance, by Andrew Weeks
Responsible ML in Production, by Catherine Nelson and Hannes Hapke
Open Source Fairness, by Aileen Nielsen
Opening the Black Box, by Ben Fowler and Chelsey Kate Meise
Safe, Fair and Ethical AI - A Practical Framework, by Tariq Rashid
"Small" data:
The Big Benefits of Small Data, by Christopher Lozinski
Taking a Close Look in the Mirror: Data Literacy for Data Experts, by Laura J Ludwig
Data processing pipelines for Small Big Data, by Esteban J. G. Gabancho and Anthony Franklin, PhD
Dirty Data science: machine-learning on non-curated data, by Gaël Varoquaux
Science and code:
Is Coding Science? An interview with Wolfgang Kerzendorf, by Wolfgang Kerzendorf (keynote)
Computational Social Science with Python, and how Open Source transforms Academia and Research, by Bhargav Srinivasa Desikan
3. Statistical and mathematical methods
---------------------------------------
Individual methods - statistics:
Bayesian Decision Science: A framework for making data informed decisions under uncertainty, by Ravin Kumar
When features go missing, Bayes’ comes to the rescue, by Narendra Mukherjee
Modelling the extreme using quantile regression, by Massimiliano Ungheretti
Geometric and statistical methods in systems biology: the case of metabolic networks, by Haris Zafeiropoulos and Apostolos Chalkis
Accelerating Differential Equations in R and Python using Julia's SciML Ecosystem, by Chris Rackauckas
Individual methods - time series:
ML-Based Time Series Regression: 10 concepts we learned from Demand Forecasting, by Felix Wick
Modern Time Series Analysis with STUMPY, by Sean Law
TimeSeries Forecasting with ML Algorithms and there comparisons, by Sonam Pankaj
Methods applied:
Multi-Label Classification with Human Rights Data, by Megan Price, PhD and Maria Gargiulo (keynote)
Climate Change: analyzing remote sensing data with Python, by Luis Lopez
Leveraging python and open-source for data-science on the buy-side., by James Munro
Games, Algorithms, and Social Good, by Manojit Nandi
4. Neural networks and fancy ML
-------------------------------
Building Large-Scale Multilingual Fuzzy Matching Framework, by Abdulrahman Althobaiti
Inventing Curriculum using Python and spaCy, by Gajendra Deshpande
Is a neural network better than Ash at detecting Team Rocket? If so, how?, by Juan De Dios Santos
Snap ML: Accelerated, Accurate, Efficient Machine Learning, by Haris Pozidis and Thomas Parnell
Taking Care of Parameters So You Don’t Have to with ParamTools, by Hank Doupe
Thrifty Machine Learning, by Rebecca Bilbro
Using EOLearn to build a machine learning pipeline to detect plastics in the ocean., by Stuart Lynn
Why I didn’t use deep learning for my image recognition problem, by Liucija Latanauskaite
Cardinal: A metrics based Active Learning framework, by Alexandre Abraham
Complex Network Analysis with NetworkX, by K. Jarrod Millman
Ensemble-X: Your personal strataGEM to build Ensembled Deep Learning Models for Medical Imaging, by Dipam Paul and Alankrita Tewari
Ordinary viDeogame Equations: Winning games with PyMC3, sundials and numba, by Adrian Seyboldt
Visions: An Open-Source Library for Semantic Data, by Ian Eaves and Simon Brugman
Visual data: abundant, relevant, labelled, cheap. Pick two?, by Irina Vidal Migallon
What Lies in Word Embeddings, by Vincent D. Warmerdam
Autonomous Vehicles See More With Thermal Imaging: Multi-modal thin cross section Object Detection, by Laisha Wadhwa
Basic Pitfalls in Waveform Analysis, by Yukio Okuda
Building one (multi-task) model to rule them all!, by Nicole Carlson and Michael Sugimura
Entity matching at scale, by Lorraine D'almeida
Uncertainty Quantification in Neural Networks with Keras, by Matias Valdenegro-Toro
Using Algorithm X to re-analyse the last UK general election, by Alex Glaser
FlyBrainLab: An Interactive Open Computing Platform for Exploring the Drosophila Brain, by Mehmet Kerem Turkcan, Aurel A. Lazar and Yiyin Zhou
5. Miscellaneous
----------------
COVID:
COVID-19 Visualizations, the Good, the Bad and the Malicious, by Rongpeng Li
What cyber security can teach us about COVID-19 testing, by Hagit Grushka-Cohen
Various "goodies" around pandas:
ipywidgets for Education! Using Jupyter tools to make Math Visualization applets for the classroom, by Chiin-Rui Tan
pandas.(to/from)_sql is simple but not fast, by Uwe Korn
What's new in pandas?, by Joris Van den Bossche and Tom Augspurger
pyodide: scientific Python compiled to WebAssembly, by Roman Yurchak
6. Tutorials and short talks (a selection):
-------------------------------------------
A Gentle Introduction to Multi-Objective Optimisation, by Eyal Kazin
Exploratory Data Analysis with Pandas and Matplotlib, by Allen Downey
Creating a data-driven culture: a social perspective, by Jordi Contestí
Data Visualization & Storytelling, by Jose Berengueres
Learning from your (model’s) mistakes, by Simona Maggio
Rapidly emulating professional visualizations from New York Times in Python using Altair, by Shantam Raj
Ten Ways to Fizz Buzz, by Joel Grus
Turn your notebook into a LaTeX-article with TexBook, by Valerio Maggio
UBI Center: A think tank built on GitHub, Python, and Jupyter, by Max Ghenis
nbreproduce: Jupyter notebooks in reproducible environments, by Mridul Seth
Building a Successful Data Science Team, by Justin J. Nguyen (main program)