Starred repositories
A C++ framework for testing stock market trading systems using data from Google Finance
Uses Celery to periodically run update tasks that integrate, clean, and normalize data from data terminals such as Wind, Tonghuashun iFinD, East Money Choice, Tushare, JQDataSDK, pytdx, and CMC, for use in data analysis by other systems
Develops a customer segmentation to define a marketing strategy. Uses PCA to reduce the dimensionality of the dataset and KMeans++ clustering to form and profile the clusters.
2021 [Mind Map] box: C/C++, Golang, Linux, cloud native, databases, DPDK, audio/video development, TCP/IP, data structures, computer fundamentals, and more
Open-source feature selection repository in Python
Adding feature_importances_ property to sklearn.cluster.KMeans class
Unsupervised and Semi-Supervised Anomaly Detection / IsolationForest / KernelPCA Detection / ADOA / etc.
PySpark RDD, DataFrame, and Dataset examples in Python
Implementations of label-propagation-like algorithms
A NetworkX implementation of Label Propagation from the paper "Near Linear Time Algorithm to Detect Community Structures in Large-Scale Networks" (Physical Review E 2008).
Python library for converting Scikit-Learn pipelines to PMML
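This project covers the key concepts and code implementations commonly tested in Machine Learning, Deep Learning, and NLP interviews, which are also the theoretical foundations every algorithm engineer must master.
Implementations of machine learning algorithms with Python and NumPy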
Implementing best practices for PySpark ETL jobs and applications.
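Solving algorithm problems is all about patterns, and labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.
Building a risk control model for loan users with machine learning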
Tools for WoE transformation, mostly used in scorecard models for credit rating
Python Jupyter Notebook application that classifies bank transactions as fraudulent or normal. Dataset taken from https://www.kaggle.com/dalpozz/creditcardfraud/data. Uses its own neural network implementation.
Kaggle Fraud Transaction Detection competition
5th Place Solution for WoPlus Phone Changing Prediction. 2016 Shanghai Unicom Open Data Competition - User Phone Change Prediction