GSoC 2013 Application Bi Ge: Automating Release Process and Sympy bot

#Contact information

Name: Bi Ge

email: [email protected] (backup: [email protected])

github username: bgee

#Short Bio and Background:

I'm a third year Computer Engineering student at Georgia Institute of Technology.

I mainly program in Emacs running on a MacBook Pro, and I have an old Gentoo Linux box that mainly runs git to host my homework code. I like Emacs for its portability: I constantly work across different platforms, and grabbing my dot files from GitHub keeps me in my comfort zone. Another reason is that Emacs was the first editor I was introduced to when I started learning to program, and I have stuck with it ever since.

I'm comfortable with C, PHP, C++, Matlab, Python, MIPS and ARM assembly, and MySQL (ranked by familiarity). I've used both git and svn intensively for school assignments. Besides my programming classes, this semester I am involved in the e-stadium team in the Vertically Integrated Projects program at Georgia Tech, where my team develops Web applications that provide various services (such as instant replay) to football fans in the stadium during games. I learned Python about one year ago and it is now my favorite language for its elegance and readability. My favorite Python feature is the list comprehension; to take an example from python.org: `squares = [x**2 for x in range(10)]`

I don't plan to travel this summer, so I will be able to spend 40 hours per week on the project.

Merged Pull Requests:

PR 1869 fixed KroneckerDelta not being canonical.

PR 2056 fixed test_import directory dependency issue.

WIP:

PR 1924 Apply .doit() to simplify.py

PR 164 Add import_time test to sympy-bot

#Goal

The goal of my GSoC project is to automate the release process and enhance sympy-bot. As Aaron stated on the mailing list: "In almost two year's time, we've had three releases, and are struggling to get out a fourth." Although we should always be careful about releasing, a complicated release process delays getting new features to users. We should take advantage of the fact that sympy has an active developer community, so that we can release more often while maintaining a high level of quality. Frequent releases may also solve the problem of holding back a release for a specific feature, because the feature can simply go into the next release. Some steps in the release process still require human judgment; I elaborate on these in the implementation details.

Another part of this project is to improve Sympy-bot. Even with Travis CI in place, sympy-bot still has its advantages. First, Sympy-bot allows us to run the tests under a specific setup that is not available on Travis. Second, with the -t option, we can run only the tests we need. Right now the main issue with the bot is that, like the release process, it requires a lot of human effort. For example, to review all the pull requests, we need to run sympy-bot list and manually pass all the numbers to sympy-bot review. By the end of the project, Sympy-bot should behave like Travis: it will run tests on all unchecked commits, post the results to GitHub, and keep looping until someone terminates it. Another possible improvement is a very simple web front end, so that if the bot is running on an old laptop under the desk, we don't need to ssh into it just to check whether it has crashed; we can go to 192.168.1.2:/sympy-bot and see the current status of the bot (how long it has been running, how many tests have passed, etc.).

To summarize, the purpose of this project is to automate the release as much as possible and improve the usability of sympy-bot to better integrate with the release process.

#Implementation details

manual flow chart

The graph above is a flow chart of the current release process. Each step except the first must be done by hand. Below I outline each procedure and a possible way to automate it.

##Release Process

###Branching and release candidate

As far as creating the new branch and pull request goes, everything should be straightforward using the GitHub authenticated (OAuth) API. We still need to be careful when we merge the branch, because the version number will have changed by the time we commit. There are two ways to solve this problem. The first is to make bin/release clever enough to detect the change in version number and merge the branch carefully. However, this is tricky and can cause problems if not implemented well. The other is to keep the merge step manual.

After we create the pull request, it is time to fire up sympy-bot (or sympy-bot will pick it up automatically, once we automate the bot's whole review cycle).
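
As a rough sketch of the branching step, the two API calls could be prepared as below. This assumes the GitHub REST endpoints for creating a git reference and a pull request; the helper names, the `master` base branch, and the pull request title are hypothetical, and the requests are only built here, not sent.

```python
import json
import urllib.request

API = "https://api.github.com"


def build_request(path, payload, token):
    """Build (but do not send) an authenticated POST request to the GitHub API."""
    return urllib.request.Request(
        API + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "token " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def release_branch_request(repo, branch, sha, token):
    # Creating a branch means creating a git reference that points at a commit.
    payload = {"ref": "refs/heads/" + branch, "sha": sha}
    return build_request("/repos/%s/git/refs" % repo, payload, token)


def release_pull_request(repo, branch, token, base="master"):
    # The title is a placeholder; the real release script would choose its own.
    payload = {"title": "Release " + branch, "head": branch, "base": base}
    return build_request("/repos/%s/pulls" % repo, payload, token)
```

Passing one of the returned objects to `urllib.request.urlopen` would perform the call; keeping construction separate from sending makes the step easy to dry-run before a real release.
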
###Check all tests are included

Right now we run python bin/generate_test_list.py and compare its output against a list in setup.py by hand. This can be automated by capturing the output of generate_test_list.py as a string and comparing it with the list in setup.py. It would be something like

```
import subprocess
import sys

# `tests` is assumed to hold the list from setup.py, in the same format as the script output
test_list = subprocess.check_output("python bin/generate_test_list.py", shell=True, universal_newlines=True)
if tests != test_list:
    print("not all tests are included")
    sys.exit(1)
```

###Check all modules are included

Like how we check tests, the old way of doing this is to run
```
 for i in `find sympy -name __init__.py | rev | cut -f 2- -d '/' | rev | egrep -v "^sympy$" | egrep -v "tests$" `; do echo "'${i//\//.}',"; done | sort
```
and compare the output against a list in setup.py. The same capture-and-compare approach used for the test list can automate this check.
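
The shell pipeline above could also be reproduced directly in Python, which avoids parsing shell output. The following is a minimal sketch under that assumption; `list_packages` and `missing_modules` are hypothetical names, and how the declared list is read out of setup.py is left open.

```python
import os


def list_packages(root="sympy"):
    """Return dotted names of all packages below root, skipping test packages."""
    packages = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "__init__.py" not in filenames:
            continue
        # Mirror the two egrep filters: drop the root itself and any "tests" dirs.
        if dirpath == root or dirpath.endswith("tests"):
            continue
        packages.append(dirpath.replace(os.sep, "."))
    return sorted(packages)


def missing_modules(declared, root="sympy"):
    """Packages found on disk that are absent from the declared list."""
    return sorted(set(list_packages(root)) - set(declared))
```

A non-empty result from `missing_modules` would be a reason to stop the release and update setup.py.
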
###Tox

Automating the tox part is straightforward as long as the tox.ini file is correctly set up.
###Check import speed

The measurement itself is easy to automate. However, because everyone is using a different machine and import time depends on it, it makes sense to show the result to a person and let them choose whether or not to proceed with the release process.
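
One possible way to take the measurement is to time the import in a fresh interpreter, so that nothing is already cached in `sys.modules`. This is only a sketch; the function name and the choice of taking the best of several runs are assumptions.

```python
import subprocess
import sys
import time


def import_time(module="sympy", python=sys.executable, repeats=3):
    """Best wall-clock time in seconds to import module in a fresh interpreter."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check_call raises CalledProcessError if the import itself fails.
        subprocess.check_call([python, "-c", "import " + module])
        timings.append(time.perf_counter() - start)
    return min(timings)
```

The number includes interpreter start-up, so it is best compared against the same measurement on the previous release on the same machine, and then shown to the maintainer.
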
###Run tests in isolated environment

After we run

```
$ bin/test_isolated
Generating py.test isolated testsuite...
Done. Run '/tmp/test_sympy.sh'.
```

the generated script (`$ /tmp/test_sympy.sh | less`) can be called with `subprocess` to make sure all tests pass.

###Slow tests

Again, running the tests with subprocess and checking the return value should be sufficient for automation.
###Make sure all examples work

Simply run the examples from subprocess and check that the string NO FAILED EXAMPLES appears at the end of the output.
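
A minimal sketch of that check is shown below. The helper name is hypothetical, and the command used to run the examples is left to the caller, since this proposal does not fix it.

```python
import subprocess
import sys


def output_ends_with(command, marker):
    """Run command and report whether its last non-empty output line contains marker."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output = proc.communicate()[0].decode("utf-8", "replace")
    lines = [line for line in output.splitlines() if line.strip()]
    return bool(lines) and marker in lines[-1]
```

The release script would call it with whatever command runs the examples and the marker `"NO FAILED EXAMPLES"`, and abort if it returns `False`.
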
###Check pyglet

After running the commands, a real person must check the plots open correctly all three times.
###Check numpy, scipy, sage, fortran

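This could be implemented by using subprocess and checking the return code.

```
import subprocess

try:
    # check_call raises CalledProcessError when the command exits with a non-zero status
    subprocess.check_call(["sage", "-python", "bin/test", "sympy/external/tests/test_sage.py"])
except subprocess.CalledProcessError:
    print("external tests failed")
```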

###Check docs

bin/release.py will change the version number first before running

```
cd doc
make clean
make html
make htmli18n
```

to ensure all the docs contain the correct version number. The maintainer must then open _build/html/index.html as well as the pdf in his/her browser to check that everything works.

###Make tarball and installers, update authors and websites

Making the tarball and installers shouldn't be too hard, while updating the authors and websites (mostly uploading files to various sites) may require a case-by-case approach and a bit of work from the maintainer.
###Write release notes

It would be handy to have a list of changes since the last release. Searching through the GitHub commit/merge messages should be enough, because if we release more often there shouldn't be too many changes between the previous version and this one.
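
As a sketch of how such a list could be gathered, the merge commits since the last release tag can be read with `git log` and filtered for the subject line GitHub writes when a pull request is merged. The function names are hypothetical, and the tag name is whatever the previous release was tagged as.

```python
import re
import subprocess

# Subject line GitHub uses for merged pull requests.
MERGE_RE = re.compile(r"^Merge pull request #(\d+) from (\S+)")


def merge_subjects(last_tag):
    """Subjects of merge commits made after last_tag (needs a git checkout)."""
    out = subprocess.check_output(
        ["git", "log", "--merges", "--pretty=format:%s", last_tag + "..HEAD"])
    return out.decode("utf-8", "replace").splitlines()


def merged_pull_requests(subjects):
    """Extract (number, source branch) pairs from merge commit subjects."""
    found = []
    for subject in subjects:
        match = MERGE_RE.match(subject)
        if match:
            found.append((int(match.group(1)), match.group(2)))
    return found
```

The resulting pull request numbers are only hints; the maintainer would still write the actual release notes from them.
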
###Do clean-ups with the branch

##Sympy-bot

The main problems with sympy-bot are:

- Right now, if we want to test all the newly updated pull requests on GitHub, we have to first run `sympy-bot list` and then `sympy-bot review 2021 2022 2023` and so on. We would like to make sympy-bot automatic, so that one can just leave sympy-bot running and it will continually find all untested commits and test them. For example, we can just do `./sympy-bot review --loop`.
- Another feature I would like to add to sympy-bot is benchmarking. We would really love to know which parts of sympy are slow and how each patch affects the speed. The goal is to have a comprehensive benchmark suite that will be executed by sympy-bot.

###Make sympy-bot automatically check for new/untested commits

This could be done by repeatedly looping through the pull request list, checking whether there are any new commits, testing them, and then starting over.
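
The loop could be sketched as follows. Here `fetch_pulls` and `review` stand in for the bot's existing list and review functionality and are passed in as plain callables; all names and the polling interval are assumptions, not sympy-bot's actual interface.

```python
import time


def untested_pulls(pulls, tested):
    """Return numbers of pull requests whose head commit has not been tested.

    pulls maps a pull request number to its head commit SHA; tested is a set
    of (number, sha) pairs that have already been reviewed.
    """
    return sorted(n for n, sha in pulls.items() if (n, sha) not in tested)


def review_loop(fetch_pulls, review, interval=300, max_rounds=None):
    """Poll for untested commits and review them, forever or for max_rounds."""
    tested = set()
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        pulls = fetch_pulls()
        for number in untested_pulls(pulls, tested):
            review(number)
            tested.add((number, pulls[number]))
        rounds += 1
        if max_rounds is None or rounds < max_rounds:
            time.sleep(interval)
    return tested
```

Tracking the head SHA rather than only the pull request number means a pull request is reviewed again when new commits are pushed to it.
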
###Add stress test mode to Sympy-bot

Stress test mode will let Sympy-bot run the slower tests that take longer to finish. When stress testing is turned on, Sympy-bot will perform a more thorough test run.
###Benchmarking
- Sympy used to have a benchmark system in /utilities/benchmark.py. The directory is still there, but it does not cover all the aspects we want to test in sympy. What we need to do is broaden the benchmark coverage and have sympy-bot run the benchmarks and leave a comment on the pull request page, for example with `./sympy-bot review 2000 --profile benchmark` (or by default in the loop mentioned above), so that whoever submits the pull request gets information about the overall performance.
- Furthermore, adding a detailed line-profiling feature can be helpful when people are trying to figure out which part of the code is slowing down the system. We can mimic the line_profiler library to produce output similar to the following for `./sympy-bot review 2000 --benchmark pystone.py`:

```
Pystone(1.1) time for 50000 passes = 2.48
This machine benchmarks at 20161.3 pystones/second
Wrote profile results to pystone.py.lprof
Timer unit: 1e-06 s

File: pystone.py
Function: Proc2 at line 149
Total time: 0.606656 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
   149                                           @profile
   150                                           def Proc2(IntParIO):
   151     50000        82003      1.6     13.5      IntLoc = IntParIO + 10
   152     50000        63162      1.3     10.4      while 1:
   153     50000        69065      1.4     11.4          if Char1Glob == 'A':
   154     50000        66354      1.3     10.9              IntLoc = IntLoc - 1
   155     50000        67263      1.3     11.1              IntParIO = IntLoc - IntGlob
   156     50000        65494      1.3     10.8              EnumLoc = Ident1
   157     50000        68001      1.4     11.2          if EnumLoc == Ident1:
   158     50000        63739      1.3     10.5              break
   159     50000        61575      1.2     10.1      return IntParIO
```

- Another point is to advertise benchmarking heavily in the community. Since whoever writes a new feature probably understands it best, it would be natural for the developer to write benchmarks for his/her own work and get feedback from them while testing.

* ###Authenticated API requests (if time permits)
    Actually there is a pull request for this [(link)](https://github.com/sympy/sympy-bot/pull/153). We should get this done if it is still not merged by the time we finish working on the release process.


* ###Add color code(html) to sympy-bot reports (if time permits)
    This requires converting terminal color codes into HTML tags before the result is posted, instead of parsing the colors on the report side.

* ###Multiprocessing tests (if time permits)

    We have quite a lot of tests to run on different interpreters. It would be nice to run multiple tests in parallel on a multi-core machine to reduce the time.
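
For the multiprocessing item above, one simple approach is to start each test command as its own process and wait for all of them from a small thread pool. This is a sketch only; `run_parallel` is a hypothetical helper and the commands would be the bot's per-interpreter test invocations.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_parallel(commands, workers=4):
    """Run each command in its own process; return {command index: exit code}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each thread only waits on a child process, so threads are enough here.
        codes = list(pool.map(subprocess.call, commands))
    return dict(enumerate(codes))
```

Setting `workers` to the number of cores keeps the machine busy without oversubscribing it; a non-zero exit code in the result marks the failing command.
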

#Timeline

* ###Week 1-2  Make `bin/release.py` able to check all tests and modules are included
    
* ###Week 3-4  Automate creating new branch, import test

* ###Week 5-6  Automate tarball process and generating hints for release notes

* ###Week 7-8  Update authors and upload new version to various websites

* ###Week 9-10 Stress test, clean up old benchmarking code and have Sympy-bot display statistics in the comment

* ###Week 10-12 Apply looping to Sympy-bot, add line-profiling feature

* ###Week 13-  Improve docs, if time permits continue to work on fixes to Sympy-bot
        
#Reference
  * Discussion in the mailing list: https://groups.google.com/forum/#!topic/sympy/UfNhyFv-oMg/discussion  
  * Another discussion about release process in Google code https://code.google.com/p/sympy/issues/detail?id=3445  
  * Older discussion on automating release process in mailing list: https://groups.google.com/forum/#!topic/sympy/UfNhyFv-oMg/discussion  
  * Github wiki on automating test: https://github.com/sympy/sympy/wiki/Test-automation  
  * Github wiki on release process: https://github.com/sympy/sympy/wiki/New-Release
