forked from douban/dpark
Python clone of Spark, a MapReduce-like framework in Python
congmo/dpark
Dpark is a Python clone of Spark, a MapReduce-like computing
framework that supports iterative computation.
Word count example wc.py:
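from dpark import DparkContext

ctx = DparkContext()
# Read the input file as a dataset of lines.
lines = ctx.textFile("/tmp/words.txt")
# Split each line into words and pair every word with a count of 1.
words = lines.flatMap(lambda x: x.split()).map(lambda x: (x, 1))
# Sum the counts per word and collect the result as a dict.
wc = words.reduceByKey(lambda x, y: x + y).collectAsMap()
print(wc)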
This script can run locally or on a Mesos cluster without
any modification, just with different command-line arguments:
$ python wc.py
$ python wc.py -m process
$ python wc.py -m mesos
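To make the data flow in wc.py easier to follow, here is a minimal sketch of the same three steps (flatMap, map, reduceByKey) in plain Python, with no Dpark dependency. The function name `word_count` and the sample input are illustrative assumptions, not part of Dpark. The sketch only mirrors what the pipeline computes on a single machine and says nothing about how Dpark distributes the work.

```python
from collections import defaultdict
from itertools import chain


def word_count(lines):
    """Count words in an iterable of text lines (hypothetical helper)."""
    # flatMap step: split every line into words and flatten into one stream.
    words = chain.from_iterable(line.split() for line in lines)
    # map step: pair each word with an initial count of 1.
    pairs = ((word, 1) for word in words)
    # reduceByKey step: add up the counts that share the same word.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    # collectAsMap step: return the result as an ordinary dict.
    return dict(counts)


if __name__ == "__main__":
    sample = ["to be or not", "to be"]  # made-up input for illustration
    print(word_count(sample))
```

In wc.py each of these steps is a method call on the dataset returned by `ctx.textFile`, which is what lets the same script run unchanged in the different modes listed above.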
See examples/ for more examples.
Some Chinese docs: https://github.com/jackfengji/test_pro/wiki