-
Notifications
You must be signed in to change notification settings - Fork 0
hmakkapati/strainer
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
Strainer is a HipChat service that identifies mentions, emoticons and
URLs in chat messages.
Installation
============
Strainer API requires the following python packages:
* eventlet
* flask
* lxml
* beautifulsoup4
* requests
* nose
* requests-mock
To install Strainer in a virtualenv:
* Create virtual environment
``virtualenv strainer-venv``
* Activate virtual environment
``source strainer-venv/bin/activate``
* Install dependencies
``pip install eventlet Flask lxml beautifulsoup4 requests nose requests-mock``
Running Tests
=============
To run tests, execute the ``run_tests.sh`` script in the top-level directory:
``./run_tests.sh``
Running Server
==============
To run the server, execute ``server.py`` script in ``strainer`` package.
``./strainer/server.py``
Notes
=====
# Improve tests
* There can be more tests around timing the timeouts while fetching titles.
* There can be a lot more tests around URL straining testing for various
kinds of valid and invalid URLs
* Similarly, testing for title fetching can be improved for a diverse set
of valid and invalid URLs
* There can be more tests around unicode characters
* Functional test coverage may be improved as well. In the interest of
time, since there is decent unit test coverage for straining and
fetching, some functional tests that could potentially be testing the
same thing are skipped here.
* Documentation in tests can be improved. However, the test names should be
self-explanatory
# This currently developed and tested on Python 2. May need more effort to make
it Python 3 compatible.
# If there's no limit on how long a chat message can be, it could be possible
the API may time out. Is there scope for parallelizing message straining?
# Currently, there are three different regular expressions. One regular
expression each for straining emoticons, mentions and URLs specifically. Is
there a way to devise one regular expression that does it all? If so, is
that going to efficient than the current three?
# Logging could be improved further. Generating and logging request ids could
prove very useful while troubleshooting
# The use of 'POST' vs 'GET' maybe debated. With GET, the input chat message
may have to be sent as a body assuming the message can be of any length.
Sending body with 'GET' is much debated and usually discouraged as there no
standards defined around processing request body with 'GET'. Hence, the use
of 'POST' here. Usage of 'POST' could be debated too from a strict RESTful
sense as the Strainer API doesn't create any resources. 'POST' just happens
to a better option for this scenario.
# There is probably scope for caching the titles if some URLs appear more
often than others in chat messages. If the URLs are fairly random, then
adding cache may not be a big gain. This needs to be studied further.
About
Identify emoticons, mentions and URLs from HipChat messages
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published