11个实用但你可能不知道的Python程序库

python| 2015-01-29

目前，网上已有成千上万个Python包，但几乎没有人能够全部知道它们。单单PyPi上就有超过47000个包列表。

现在，越来越多的数据科学家开始使用Python，虽然他们从pandas，scikit-learn，numpy中获得了不少好处，但我仍想向他们介绍一些年长且非常实用的Python库。在本文中，我将列一些不太知名的库，即使你是经验丰富的Python的开发者，也值得过来一看。

1) delorean

Dolorean是一个非常酷的日期/时间库。类似JavaScript的moment，拥有非常完善的技术文档。

from delorean import Delorean
EST = "US/Eastern"
d = Delorean(timezone=EST)

2) prettytable

你可能从未听过该库，因为它托管在GoogleCode。prettytable主要用于在终端或浏览器端构建很好的输出。

from prettytable import PrettyTable
table = PrettyTable(["animal", "ferocity"])
table.add_row(["wolverine", 100])
table.add_row(["grizzly", 87])
table.add_row(["Rabbit of Caerbannog", 110])
table.add_row(["cat", -1])
table.add_row(["platypus", 23])
table.add_row(["dolphin", 63])
table.add_row(["albatross", 44])
table.sort_key("ferocity")
table.reversesort = True
+----------------------+----------+
|        animal        | ferocity |
+----------------------+----------+
| Rabbit of Caerbannog |   110    |
|      wolverine       |   100    |
|       grizzly        |    87    |
|       dolphin        |    63    |
|      albatross       |    44    |
|       platypus       |    23    |
|         cat          |    -1    |
+----------------------+----------+

3.snowballstemmer

好吧，我也是首次安装该库。这是一款非常瘦小的语言转换库，支持15种语言。

from snowballstemmer import EnglishStemmer, SpanishStemmer
EnglishStemmer().stemWord("Gregory")
# Gregori
SpanishStemmer().stemWord("amarillo")
# amarill

4.wget

你是否还记得，每一次都会因为某个目的而编写网络爬虫工具，以后再也不用了，因为wget就足够你使用了。wget是Python版的网络爬虫库，简单好用。

import wget
wget.download("https://www.cnn.com/")
# 100% [............................................................................] 280385 / 280385

备注：linux和osx用户这样用：from
sh import wget。但是，wget模块还有一个更好的argument handline。

5.PyMC

scikit-learn似乎是所有人的宠儿，但在我看来，PyMC更有魅力。PyMC主要用来做Bayesian分析。

from pymc.examples import disaster_model
from pymc import MCMC
M = MCMC(disaster_model)
M.sample(iter=10000, burn=1000, thin=10)
[-----------------100%-----------------] 10000 of 10000 complete in 1.4 sec

6.sh

sh库用来将shell命令作为函数导入到Python中。在bash中使用是非常实用的，但是在Python中不容易记住怎么使用（即递归搜索文件）。

from sh import find
find("/tmp")
/tmp/foo
/tmp/foo/file1.json
/tmp/foo/file2.json
/tmp/foo/file3.json
/tmp/foo/bar/file3.json

7.fuzzywuzzy

Fuzzywuzzy是一个可以对字符串进行模糊匹配的库，大家有空可以去查看源码。

from fuzzywuzzy import fuzz
fuzz.ratio("Hit me with your best shot", "Hit me with your pet shark")
# 85

8.progressbar

progressbar是一个进度条库，该库提供了一个文本模式的progressbar。

from progressbar import ProgressBar
import time
pbar = ProgressBar(maxval=10)
for i in range(1, 11):
    pbar.update(i)
    time.sleep(1)
pbar.finish()
# 60% |########################################################                                      |

9.colorama

colorama主要用来给文本添加各种颜色，并且非常简单易用。

10.uuid

uuid是基于Python实现的UUID库，它实现了UUID标注的1，3,4和5版本，在确保唯一性上真的非常方便。

import uuid
print uuid.uuid4()
# e7bafa3d-274e-4b0a-b9cc-d898957b4b61

11.bashplotlib

bashplotlib是一个绘图库，它允许你使用stdin绘制柱状图和散点图等。

$ pip install bashplotlib
$ scatter --file data/texas.txt --pch x

英文原文：11 Python Libraries You Might Not Know

本文文字及图片出自 CSDN

你也许感兴趣的：

共有 3 条讨论

jimmy说道：

2015-01-29

用python做数据分析必须要用到pandas/NumPy的啊：）欢迎加入pandas技术QQ群：297882961，一起学习交流

回复
啦啦啦这篇文章
2015-11-29 15:01
Jaya 这篇文章
2016-09-04 01:11