    Sanhe Hu

    Welcome to my world :)

    Open Source Projects

    Thursday, September 07, 2017

    pygitrepo

    pygitrepo gives Python developers a very easy way to initialize a polished Python project in one second!

    If you take a look at top open source Python projects, such as requests and scrapy, you may find that they have several things in common:

    1. Full documentation.

    2. Easy to test.

    3. Continuous integration testing and high code coverage.

    4. Compatible with Python 2.7 and Python 3.X.

    pygitrepo gives you all of these out of the box, and keeps all your Python projects as professional as requests.


    Project Homepage

    Monday, May 01, 2017

    pytq (Python Task Queue)

    pytq is a task queue framework library. It provides easy integration with logging and database systems, and it is highly extensible to any data persistence layer and worker logic.

    This project was originally designed to improve code quality in web crawler projects.

    Project Homepage

    Thursday, November 02, 2017

    compress lib

    There are many mature data compression algorithms to choose from. compress provides a normalized API to use them and to switch between them in Python.

    • zlib.

    • bz2.

    • lzma, high compression ratio but slow (it is part of the standard library since Python 3.3; you can use backports.lzma for earlier versions).

    • pylzma, another implementation, faster at decompression than lzma.

    • snappy, from Google, lower compression ratio but very fast!

    • lz4, lower compression ratio, very fast!
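
    To illustrate what a normalized API over several algorithms can look like, here is a minimal sketch that uses only the standard library codecs from the list above (zlib, bz2, lzma). The names `_CODECS`, `compress`, and `decompress` are hypothetical and chosen for this example; they are not taken from the compress library, whose actual interface may differ.

```python
import bz2
import lzma
import zlib

# Hypothetical registry (not the compress library's API):
# maps an algorithm name to its (compress, decompress) function pair.
_CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def _lookup(algorithm):
    """Return the codec pair for a name, or raise a readable error."""
    try:
        return _CODECS[algorithm]
    except KeyError:
        raise ValueError(
            "unknown algorithm %r, expected one of %s"
            % (algorithm, sorted(_CODECS))
        )


def compress(data, algorithm="zlib"):
    """Compress bytes with the chosen algorithm."""
    encode, _ = _lookup(algorithm)
    return encode(data)


def decompress(data, algorithm="zlib"):
    """Decompress bytes that were produced by the same algorithm."""
    _, decode = _lookup(algorithm)
    return decode(data)


if __name__ == "__main__":
    payload = b"hello world " * 100
    for name in sorted(_CODECS):
        packed = compress(payload, algorithm=name)
        assert decompress(packed, algorithm=name) == payload
        print(name, len(payload), "->", len(packed))
```

    Because the caller only passes a name, switching algorithms is a one-argument change, and third-party codecs such as snappy or lz4 could be registered in the same table if those packages are installed.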

    Project Homepage

    Friday, August 12, 2016

    Superjson

    Superjson adds support for date, datetime, set, OrderedDict, deque, and numpy.ndarray, which the built-in json module cannot serialize, and it is easy to extend to support any custom data type.


    More awesome features that the built-in json module doesn't have ...
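
    For context, the sketch below shows one common way to make the built-in json module handle a few extra types, using its `default` and `object_hook` parameters. The tag names (`$datetime`, `$date`, `$set`) and the helper functions are assumptions made for this example; they do not describe Superjson's actual format or API.

```python
import json
from datetime import date, datetime


def _encode_extra(obj):
    """Called by json.dumps for objects it cannot serialize itself."""
    # Check datetime before date, because datetime is a subclass of date.
    if isinstance(obj, datetime):
        return {"$datetime": obj.isoformat()}
    if isinstance(obj, date):
        return {"$date": obj.isoformat()}
    if isinstance(obj, set):
        return {"$set": list(obj)}
    raise TypeError("cannot serialize %r" % type(obj))


def _decode_extra(dct):
    """Called by json.loads for every decoded JSON object."""
    if len(dct) == 1:
        if "$datetime" in dct:
            return datetime.fromisoformat(dct["$datetime"])
        if "$date" in dct:
            return date.fromisoformat(dct["$date"])
        if "$set" in dct:
            return set(dct["$set"])
    return dct


def dumps(obj):
    return json.dumps(obj, default=_encode_extra)


def loads(text):
    return json.loads(text, object_hook=_decode_extra)


if __name__ == "__main__":
    record = {
        "released": date(2016, 8, 12),
        "updated": datetime(2016, 8, 12, 9, 30),
        "tags": {"json", "python"},
    }
    assert loads(dumps(record)) == record
```

    Wrapping each special value in a single-key tagged object keeps the output valid JSON while letting the decoder tell a serialized set apart from an ordinary list. This sketch requires Python 3.7 or later for `fromisoformat`.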

    Project Homepage

    Thursday, November 12, 2015

    constant2

    constant2 is a Python package that organizes large numbers of constants in a variety of ways, such as grouping and nesting, and provides a rich API to manipulate their values. It helps you write concise and correct code in large Python software.

    Project Homepage

    Sunday, November 01, 2015

    pyknackhq

    Knackhq is an awesome developer tool. It allows developers to build a user-friendly, database-backed web application in a few clicks, so non-developers can enjoy the power and convenience of a NoSQL database with no pain. However, developers may still want to manipulate it programmatically, which is what the Knackhq API is for. Python is the easiest powerful general-purpose programming language, and most new technologies rank a Python API among their top three priorities. I don't know why there was no official Python API for this great app, Knackhq.

    Good news: my pyknackhq project has been selected by Knackhq Inc. as their official Python API.

    Project Homepage

    Thursday, January 15, 2015

    sqlite4dummy

    sqlite4dummy is a high-performance, object-oriented SQLite API for data scientists.

    Features:

    1. Faster than SQLAlchemy, the top database project in the Python community.

    2. Human-language-like syntax; minimal code is needed.

    3. Many ready-made methods are provided for frequently used operations and data manipulation.

    Project Homepage

    Saturday, November 01, 2014

    World-Cup Big Data Gambling

    Who has the most information to predict the result of a soccer match? The gambling company. The idea is really simple: if we know how much money has been bet on Win/Lose/Draw, then the worst case for the gambling company should never happen. The sentiment of posts on Twitter and Facebook reveals everything. To keep my source code from being abused, the project is now private.

    Sunday, September 01, 2013

    royalflush

    In a game of Texas poker, there are billions of uncertainties. royalflush is a pure Python library that can calculate your probability of winning within seconds, and the calculation engine can run in the cloud. Several search engine techniques, such as bitmap hashing and sharded indexes, are used for optimal performance. WARNING: DO NOT use this in real gambling.

    Project Homepage

    Wednesday, October 07, 2015

    uszipcode

    uszipcode is the most powerful and easy-to-use zipcode information search engine in Python. Besides geometry data (including boundary info), several useful census data points are also served: population, population density, total wages, average annual wage, housing units, land area, and water area. The geometry and geocoding data I am using is from the Google Maps API as of October 2015.

    Project Homepage