Aggregated blog posts about Django, updated every hour

May 15, 2012

Circular import errors

From Reinout van Rees on May 15, 2012 12:56 AM

In rare cases, you can import a file that imports the file you're importing from. This might sound a bit recursive, and it is. In Python it is called a circular import error.

The best way to recognize a circular import that goes wrong is to look at your results. You will get an ImportError message from Django, like cannot import some_view from your_project.views. When you look in your_project/views.py you see that some_view really exists in that very same file. Huh? That wasn't expected.

This huh??? is exactly what a circular import looks like. Something that exists (you checked it five times at least) doesn't seem to exist. A computer doesn't lie, but you start to wonder.

The problem is, Python cannot complete the import, so it complains about the point where it goes wrong, even though the actual error is the circular import loop as a whole.

The solution is to break the loop somewhere. Perhaps the thing you want to import is best placed somewhere else? Most often, a circular import error indicates an organization problem in your code.
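To make the "huh???" moment concrete, here is a minimal sketch that reproduces it outside Django. The module names (loop_views, loop_models, fixed_views, fixed_models) are made up for this illustration, and the deferred import shown in the fixed pair is only one way of breaking the loop; moving the shared code somewhere else is often the cleaner choice.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path


def write_module(directory, name, source):
    """Write a small module file into the given directory."""
    (Path(directory) / f"{name}.py").write_text(textwrap.dedent(source))


def try_import(module_name):
    """Import a module by name and report whether it succeeded."""
    try:
        importlib.import_module(module_name)
        return "ok"
    except ImportError as error:
        return f"ImportError: {error}"


workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)

# Broken pair: each module imports a name from the other at import time.
write_module(workdir, "loop_views", """
    from loop_models import Thing   # starts importing loop_models...
    def some_view():
        return Thing
""")
write_module(workdir, "loop_models", """
    from loop_views import some_view   # ...which needs the half-loaded loop_views
    class Thing:
        pass
""")

# Fixed pair: the loop is broken by importing inside the method instead.
write_module(workdir, "fixed_views", """
    from fixed_models import Thing
    def some_view():
        return Thing
""")
write_module(workdir, "fixed_models", """
    class Thing:
        def view(self):
            from fixed_views import some_view   # deferred until call time
            return some_view
""")

importlib.invalidate_caches()
broken_result = try_import("loop_views")
fixed_result = try_import("fixed_views")
print(broken_result)
print(fixed_result)
```

The error for the broken pair names some_view, even though some_view is plainly defined in loop_views. That is the same confusing symptom described above.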

May 14, 2012

10 reasons to go to DjangoCon Europe

From Daniel Greenfeld on May 14, 2012 07:30 PM

You should go to DjangoCon Europe in lovely Zurich, Switzerland. Here are 10 reasons why:

1. Chocolate

So much of what we like about chocolate comes from Switzerland. For example, Milk Chocolate was invented in Switzerland.

2. Keynote speaker: Jacob Kaplan-Moss

Always a great speaker and fun to be around, he's one of the BDFLs of Django.

3. Cheese

I grew up thinking that Swiss Cheese was just about holes. It's so much more. I can't wait to try fresh European cheese made by master craftsmen from the freshest ingredients.

4. Keynote speaker: Jessica McKellar

In a word, Jessica is incredible. She's a Twisted core developer, PSF board member, part of the trio responsible for the gigantic Boston Python User Group's massive size explosion, and a talented speaker. She's used her incredible talents and skills to increase diversity in the community and generally help other people.

5. Breakfast

Muesli was invented in Switzerland. I love Muesli. I was floored by how much better it was in New Zealand. I can't wait to try it in its homeland.

6. Web Site

The DjangoCon Europe site is crazy. I mean, look at all those animations!

7. Talks

This is a single track event with proven speakers like Zachary Voase and Andrew Godwin, yet balances that with bringing in new blood to spice things up. And dare I say I'm giving a technical talk with Audrey Roy? ;-)

8. Mountains

With all the incredible food, you would think you would gain umpteen kilograms. Fortunately there are mountains all around to climb and hike.

9. Sprints

Want to sprint on Django itself? Look no further because there will be Django core developers around! There will also be notable Python developers like Kenneth Reitz and others around working hard on a lot of different projects. It's going to be intense and fun!

10. Castles

Living in the USA, we just don't have anything like castles. DjangoCon Europe will be near a small horde of stone fortifications. Which means if the Zombie Apocalypse happens during the conference, we'll have many secure places to go. They also make lovely tourist destinations. :-)

What are you waiting for?

DjangoCon Europe has a cap on attendance. Tickets for Python events have been selling out, not just for PyCon US. Don't miss out!

It's all about me

Yup.

Call me selfish but I want you there because I haven't met all our European friends yet in person. Hope to see you next month in Zurich!

May 11, 2012

AOP in Python API design - Douwe van der Meij

From Reinout van Rees on May 11, 2012 12:37 PM

AOP is Aspect Oriented Programming. It is not often used in Python; you see it more often in, for instance, Java.

His program does calculations on biogas installations. And the calculation needs to be exposed through a web API (using Django).

He thought: what about aspects? For instance the security aspect? The statistics/logging aspect? The serialization aspect? These are all separate aspects of the API. He showed a bit of sample code where all these aspects lived inside one single Django view method. It looked like quite a mixed-up mess (as was his intention).

The resulting code is both scattered and tangled. Scattered because one aspect is scattered all over the place in your code. Tangled because different aspects are often interwoven/mixed.

You can use AOP to fix this. Now, how do we implement this in pure Python? The number one candidate is decorators.

AOP deals with pointcuts, join points and advices. Advices can be pre, post and around ("a wrapper"). You can use three decorators for these three cases. In the end his example looked a bit like this:

@secure
@serialize
@statistics
@dispatch
def api_call(...):
    ...

AOP offers some brilliant concepts for software engineering. Separate your concerns and aspects; avoid tangling and scattering!

In response to a question: no, this is not a framework, these are just regular decorators. But he used the ideas of AOP to design his decorators.
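The talk's actual decorators were not shown in full, so the following is only a sketch of how pre, post and around advice could be written as plain decorators. The helper names (pre, post, around, check_user, to_text, record) and the in-memory call_log are assumptions made for this example. The @dispatch decorator from the talk is left out.

```python
import functools

call_log = []  # stands in for a real statistics/logging backend


def pre(advice):
    """Build a decorator that runs `advice` before the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            advice(*args, **kwargs)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def post(advice):
    """Build a decorator that passes the result through `advice` afterwards."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return advice(func(*args, **kwargs))
        return wrapper
    return decorator


def around(advice):
    """Build a decorator where `advice` wraps the whole call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return advice(func, *args, **kwargs)
        return wrapper
    return decorator


def check_user(user, *args, **kwargs):
    """Security aspect: refuse anyone who is not the (hypothetical) admin."""
    if user != "admin":
        raise PermissionError("not allowed")


def to_text(result):
    """Serialization aspect: turn the raw result into a string."""
    return "result=%s" % result


def record(func, *args, **kwargs):
    """Statistics aspect: note when the call starts and ends."""
    call_log.append("start " + func.__name__)
    result = func(*args, **kwargs)
    call_log.append("end " + func.__name__)
    return result


secure = pre(check_user)
serialize = post(to_text)
statistics = around(record)


@secure
@serialize
@statistics
def api_call(user, x, y):
    # The view itself only contains the actual calculation.
    return x + y


output = api_call("admin", 2, 3)
print(output)
print(call_log)
```

Each aspect now lives in one place (no scattering) and the view body contains only the calculation (no tangling).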

Django-crispy-forms - Miguel Araujo

From Reinout van Rees on May 11, 2012 11:55 AM

django-crispy-forms is a Django application, but Miguel thinks the talk will help you also with designing other systems and applications.

Django has three ways to render forms: as_ul, as_p, as_table. They do the same thing, but render the form in different ways. Common questions from people new to Django are "what about divs?" and "how to reorder fields?". For the last one, you need to switch the order of the form fields in your Python form code. There are some other tricks like overriding the self.fields.keyOrder attribute. If you have many fields, regular list methods like remove(), pop() and insert() might help you.

But... ModelForms for the admin interface are different again: the abovementioned form tricks don't work there. And those tricks sound a bit dirty anyway. So: how to customize the form output?

You can do a lot by customizing the Django form in the template, but most of it will be hardcoded and hand-tweaked that way. And if you customize a form, you'll often forget form.errors and form.non_field_errors, for instance.

Django-crispy-forms was formerly known as django-uni-form. It was created by pydanny in 2008; Miguel is now one of the main committers.

Crispy forms work on forms, modelforms and on formsets. A |crispy filter in the template renders your form as handy divs with better classes and IDs which helps a lot if you want to customize your form with css. Neat!

Crispy also has a {% crispy %} template tag. You can pass it a "form helper". A form helper is a global helper: it is decoupled from forms. So it normally works with any form. There are some attributes on the form helper that you can set, like method (get/post), form_id, things like that.

You can customize such a helper specifically for one form and set the order of the fields, for instance. There are so many things you might want to customize; crispy supports/allows most/all of them with a Layout class and other layout classes like Div. You can get really deep into the machinery by letting crispy inject Django template code directly into the template...

For the ultimate in customizability, you can write your own layout class that renders itself in whatever way you want. Layouts can be nested, so there is a lot of flexibility here. You can also customize crispy's own templates that it uses for fields and forms using the regular Django overwrite-a-template mechanism.

Handy: crispy forms has specific support for Twitter Bootstrap. This helps you get a nice looking form.

Geodjango - Ivor Bosloper

From Reinout van Rees on May 11, 2012 11:00 AM

In the GIS world, everything used to be proprietary (ESRI, Oracle), but there has been a lot of commoditization in recent years. Lots of open source. One of those open source pieces is geodjango.

Geodjango is bundled with Django, but out of the box you miss a couple of pieces. You need to install a couple of extra libraries (gdal, geos, proj) and you need a geospatial database (postgis, oracle, mysql, spatialite).

Ivor guided us through a sample application. Things like setting a gis database instead of a standard one. Adding django.contrib.gis to the INSTALLED_APPS setting. And special geometry fields for points, lines, polygons. Using a specific OSMGeoAdmin for showing a map in the admin interface for those geo fields.

A limitation in geodjango is that it doesn't give you regular form fields for the geo fields. They work in the admin, but not in regular forms. Luckily django-floppyforms does provide them, so he used floppyforms to get nice forms including a map in his regular web interface. He also created geojson from database content and showed that on the map.

(Note to self: look at proj4js).

Geodjango is well-integrated into Django, but you do need to use the geodjango variants of fields, databases, admins. "You need to prefix the stuff". You get a lot out of the box, but there's quite a learning curve. You also need to learn quite a bit of JavaScript for the user interface.

I am doing HTTP wrong - Armin Ronacher

From Reinout van Rees on May 11, 2012 08:59 AM

According to Armin Ronacher, most (Python) web frameworks use a request/response style of handling HTTP. At his company, they're treating HTTP a little bit differently. (So the talk is first about some HTTP-usage-in-Python observations and second a look at the alternative way they're treating HTTP).

Note: my brother has a clearer summary, btw.

The most low-level way is to write directly to the response. Write the response headers, write the actual response content. In Python, you often have some response object; often some sort of middleware gets the chance to do something to the response on the way out.

The nice things we like about HTTP:

  • It is text based. You can easily debug it.
  • REST is handy for APIs.
  • Content negotiation.
  • Caching.
  • Very very well supported :-)

A basic question you should ask yourself is why does my application look like HTTP? A common Django application gets a request, does something and returns a response. Works well. But why is it set up that way? Why is it so focused on HTTP? (It is logical that it focuses on this use case, but you can still ask the question).

HTTP can be a stream or buffered. Sending stuff from the server to your browser is a stream. But often an incoming request in a Python web framework is first buffered internally (memory or disk). In the same way a request is a bit of a strange mix:

  • request.headers: buffered
  • request.form
  • request.files: buffered to disk
  • request.body: streamed!

On the client (like your web browser), the only thing you can do to an incoming response, once it has started, is to close the connection. You cannot interact anymore once you have received your first incoming byte.

A consequence of the buffering and the way HTTP is handled is that you can have problems accepting data. How big a file should you accept? How big an incoming form? Buffer it in memory? Or on disk? And how do you handle streaming? You might be streaming in one part of your code, but how do the other layers handle it?

Internally in his company, he's trying to handle HTTP differently. There's no direct HTTP contact in most of the code base. Everything that eventually ends up in the HTTP layer is implemented as some sort of "type object". This allowed them to be really flexible in the HTTP layer. Support for different input/output formats. Easier to test. Documentation can be auto-generated. Lots of common errors can be caught early.

A basic rule is to be strict in what you send, but generous in what you receive. But web Python code is often generous by "just" accepting a lot without much checking. That might be a security risk. In Armin's system, you know what type should be coming in, so you can do proper checking.

How does this deal with the big-upload problem? Incoming streaming data? Well, because of the type system, you actually know which types need a streaming API. This makes it easy to set up your API correctly. You can even selectively use a different protocol than HTTP.
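The talk did not show the actual type system, so the following is purely an illustrative sketch of the idea: describe what should come in as a "type object", check incoming data strictly against it, and let the type tell you which parts need a streaming API. The class names Field and RequestType and all field names are invented for this example.

```python
class Field:
    """Describes one incoming value: its name, type and whether it streams."""

    def __init__(self, name, kind, streamed=False):
        self.name = name
        self.kind = kind
        self.streamed = streamed


class RequestType:
    """A hypothetical 'type object' describing what an endpoint accepts."""

    def __init__(self, *fields):
        self.fields = fields

    def parse(self, raw):
        """Strictly check raw input; reject unknown or badly typed values."""
        known = {field.name for field in self.fields}
        unknown = set(raw) - known
        if unknown:
            raise ValueError("unexpected fields: %s" % sorted(unknown))
        parsed = {}
        for field in self.fields:
            if field.name not in raw:
                raise ValueError("missing field: %s" % field.name)
            value = raw[field.name]
            if not isinstance(value, field.kind):
                raise TypeError(
                    "%s must be %s" % (field.name, field.kind.__name__))
            parsed[field.name] = value
        return parsed

    def streaming_fields(self):
        """Names of the fields that should get a streaming API."""
        return [field.name for field in self.fields if field.streamed]


upload = RequestType(
    Field("title", str),
    Field("size", int),
    Field("payload", bytes, streamed=True),
)

parsed = upload.parse({"title": "holiday.jpg", "size": 3, "payload": b"abc"})
needs_streaming = upload.streaming_fields()
print(parsed["title"], needs_streaming)
```

Because the description is data rather than code buried in a view, the same object could in principle also drive documentation or a non-HTTP transport, which is the flexibility the talk was pointing at.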

CSS Hacking to make my code samples legible

From Daniel Greenfeld on May 11, 2012 08:30 AM

I've been very happy with Pelican as a blog engine so far, and haven't even moved off the sample theme. There's just been one problem: others and I have had a lot of trouble reading the code snippets.

I didn't have time to cook up a full Pelican theme, so instead I just hacked the local CSS files. The problem with this hack is that every time I regenerate the blog I have to copy the right CSS files into place. So next week when I have time I'll do a proper Pelican theme.

In the meantime, enjoy!

from random import shuffle

class Meal(object):
    def __init__(self):
        self.food_type = ['Beef', 'Fish', 'Vegetarian', 'Chicken']
        shuffle(self.food_type)

May 10, 2012

Secs sell! How I cache my entire pages (server-side)

From Peter Bengtsson on May 10, 2012 05:42 PM

I've blogged before about how this site can easily push out over 2,000 requests/second using only 6 WSGI workers excluding latency. The reason that's possible is because the whole page(s) can be cached server-side. What actually happens is that the whole rendered HTML blob is stored in the cache server (Redis in my case) so that no database queries are needed at all.

I wanted my site to still "feel" dynamic in the sense that once you post a comment (and it's published), the page automatically invalidates the cache and thus, the user doesn't have to refresh his browser when he knows it should have changed. To accomplish this I used a hacked cache_page decorator that makes the cache key depend on the content it depends on. Here's the code I actually use today for the home page:

def _home_key_prefixer(request):
    if request.method != 'GET':
        return None
    prefix = urllib.urlencode(request.GET)
    cache_key = 'latest_comment_add_date'
    latest_date = cache.get(cache_key)
    if latest_date is None:
        # when a blog comment is posted, the blog modify_date is incremented
        latest, = (BlogItem.objects
                   .order_by('-modify_date')
                   .values('modify_date')[:1])
        latest_date = latest['modify_date'].strftime('%f')
        cache.set(cache_key, latest_date, 60 * 60)
    prefix += str(latest_date)

    try:
        redis_increment('homepage:hits', request)
    except Exception:
        logging.error('Unable to redis.zincrby', exc_info=True)

    return prefix

@cache_page_with_prefix(60 * 60, _home_key_prefixer)
def home(request, oc=None):
    ...
    try:
        redis_increment('homepage:misses', request)
    except Exception:
        logging.error('Unable to redis.zincrby', exc_info=True)
    ...

And in the models I then have this:

@receiver(post_save, sender=BlogComment)
@receiver(post_save, sender=BlogItem)
def invalidate_latest_comment_add_dates(sender, instance, **kwargs):
    cache_key = 'latest_comment_add_date'
    cache.delete(cache_key)

So this means:

  • whole pages are cached for a long time for fast access
  • updates immediately invalidate the cache for best user experience
  • no need to mess with ANY SQL caching
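The mechanism is easier to see without Django and Redis in the way. Below is a stripped-down sketch in which a plain dict plays the cache and a module-level variable plays the database; the names (key_prefix, render_home, post_comment) are invented for this illustration and timeouts are left out entirely.

```python
cache = {}          # stands in for Redis/memcached
render_count = 0    # counts the expensive page renders
latest_modify_date = "2012-05-10T17:42:00"  # stands in for the database


def key_prefix():
    """Return the latest modify date, hitting the 'database' only on a miss."""
    latest = cache.get("latest_comment_add_date")
    if latest is None:
        latest = latest_modify_date  # the "database query"
        cache["latest_comment_add_date"] = latest
    return latest


def render_home():
    """The expensive part: build the whole page."""
    global render_count
    render_count += 1
    return "<html>home as of %s</html>" % latest_modify_date


def home():
    """Serve the page from cache; the key changes whenever content changes."""
    key = "home:" + key_prefix()
    if key not in cache:
        cache[key] = render_home()
    return cache[key]


def post_comment(new_date):
    """Simulate a save: update the data and drop the cached date."""
    global latest_modify_date
    latest_modify_date = new_date
    cache.pop("latest_comment_add_date", None)  # what the post_save receiver does


home()
home()
renders_before = render_count
post_comment("2012-05-10T18:00:00")
page = home()
renders_after = render_count
print(renders_before, renders_after, page)
```

Note that the old page is never deleted explicitly. It simply stops being looked up because the key prefix has changed, and in a real cache it would expire on its own.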

So, the next question is, if posting a comment means that the cache is invalidated and needs to be populated, what's the ratio of hits versus hits where the cache is cleared? Glad you asked. That's why I made this page:

www.peterbe.com/stats/

It allows me to monitor how often a new blog comment or general time-out means poor Django needs to re-create the HTML using SQL.

At the time of writing, one in every 25 hits to the homepage requires the server to re-generate the page. And still the content is always fresh and relevant.

The next level of optimization would be to figure out whether a particular page update (e.g. a blog comment posting on a page that isn't featured on the home page) should or should not invalidate the home page. esp

Choosing an API framework for Django

From Daniel Greenfeld on May 10, 2012 08:00 AM

First off, out of the box, Django lets you construct API responses with a little work. All you need to do is something like this:

# Copied from https://docs.djangoproject.com/en/1.4/topics/class-based-views/#more-than-just-html
from django import http
from django.utils import simplejson as json

class JSONResponseMixin(object):
    def render_to_response(self, context):
        "Returns a JSON response containing 'context' as payload"
        return self.get_json_response(self.convert_context_to_json(context))

    def get_json_response(self, content, **httpresponse_kwargs):
        "Construct an `HttpResponse` object."
        return http.HttpResponse(content,
                                 content_type='application/json',
                                 **httpresponse_kwargs)

    def convert_context_to_json(self, context):
        "Convert the context dictionary into a JSON object"
        # Note: This is *EXTREMELY* naive; in reality, you'll need
        # to do much more complex handling to ensure that arbitrary
        # objects -- such as Django model instances or querysets
        # -- can be serialized as JSON.
        return json.dumps(context)

Once you get that mixin, use it in your views like so:

# modified from djangoproject.com sample code
from django.contrib.auth.models import User
from django.utils import simplejson as json

class JSONDetailView(JSONResponseMixin, MyCustomUserView):
    def convert_context_to_json(self, context):
        # list() turns the lazy queryset into plain dicts that json can dump
        context['objects'] = list(
            User.objects.values('first_name', 'last_name', 'is_active'))
        return json.dumps(context)

This works pretty well in a number of simple cases, but doing things like pagination, posting of data, metadata, API discovery, and other important things ends up being a bit more work. This is where the resource oriented API frameworks come in.
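To give a feel for the "bit more work", here is a minimal sketch of just one of those items, pagination with metadata, written in plain Python rather than against any particular framework. The function name paginate and the shape of the "meta" block are assumptions for this example, not the format of django-tastypie or django-rest-framework.

```python
import json


def paginate(items, page, per_page=2):
    """Return one page of items as JSON, with metadata about the full set."""
    total = len(items)
    start = (page - 1) * per_page
    body = {
        "meta": {
            "page": page,
            "per_page": per_page,
            "total": total,
            "has_next": start + per_page < total,
        },
        "objects": items[start:start + per_page],
    }
    return json.dumps(body, sort_keys=True)


users = [{"first_name": name} for name in ["Ada", "Grace", "Guido"]]
first_page = json.loads(paginate(users, page=1))
last_page = json.loads(paginate(users, page=2))
print(first_page["meta"], last_page["objects"])
```

Multiply this by validation, throttling, authentication and the rest of the list, and the appeal of a framework that has already solved them becomes clear.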

What makes a decent API Framework?

These features:

  • pagination
  • posting of data with validation
  • Publishing of metadata along with querysets
  • API discovery
  • proper HTTP response handling
  • caching
  • serialization
  • throttling
  • permissions
  • authentication

Proper API frameworks also need:

  • Really good test coverage of their code
  • Decent performance
  • Documentation
  • An active community to advance and support the framework

If you weigh these factors, at this time there are only two API frameworks worth using: django-tastypie and django-rest-framework.

Which one is better? django-tastypie or django-rest-framework?

I say they are equal.

You simply can't go wrong with either one. The authors and communities behind both of them are active, the code is solid and tested. And here are my specific thoughts about both of them:

django-tastypie

Using django-tastypie is like playing with pure Python while using the Django ORM. I find it very comfortable. Seems really fast too. The documentation is incredible, and I rarely have any problems figuring anything out. It also supports OAuth 1.0a out of the box, which is mighty awesome these days.

In fact, I wrote a custom OAuth2 handler for django-tastypie for consumer.io that I'm working to extract for publication.

django-rest-framework

As it's based off Django 1.3 style Class Based Views (CBVs), it has a very familiar pattern. Actually, because of the quality of the documentation, I really prefer using django-rest-framework CBVs more than using Django's actual CBVs.

Maybe I should make an HTML renderer for django-rest-framework? :-)

But what about django-piston?

Don't use django-piston.

I don't want to say anything negative, but let's face it: django-piston is dead. Besides a critical security release last year, nothing has been done for it in about 3 years. The documentation is weak, the code mostly untested, and the original author left. He has gone on to do some amazing things. Django-piston was amazing in its time, but its time has passed and so should you.

The only reason for using django-piston for years has been that it supported OAuth, but django-tastypie now addresses that use case. I've used django-tastypie's basic OAuth class and rolled custom Authentication modules to support some extra OAuth flavors and found it wonderful.

Use django-tastypie or django-rest-framework instead. You'll be much, much happier for it.


May 09, 2012

Django Requirements for a project

From Daniel Greenfeld on May 09, 2012 08:00 AM

Today I'm starting a new project. I'm working as fast as I can and hope to launch on Friday. What are my package dependencies?

Django==1.4

Unlike my last quick project which was Flask, this effort really falls into Django's sweet spot. I need sessions, forms, templates, and models to do things in an ideal Django pattern.

psycopg2==2.4.5

I need transactions and hard-type validation in the database, which means PostgreSQL. If I didn't need transactions or the hard-type validation I would consider MongoDB instead.

django-debug-toolbar==0.9.4

Because not using this tool is insane.

django-extensions==0.8

Because amongst other things this library gives you, I never want to write my own TimeStampedModel ever again. :-)

South==0.7.5

Django gives you the freedom to migrate data in the way you want. The way I want to do it is via South.

django-registration==0.8.0

Normally django-social-auth is my go-to tool for registration, but in this case I need simple username/password registration. This is a very solid tool, but you do have to make your own templates or find someone's fork that has a copy of templates that match.

django-floppyforms==0.4.7

An excellent tool for making your forms HTML5-ish out of the box.

django-crispy-forms==1.1.3

The child of my own django-uni-form, this will let me create forms using div-based controls super fast, and do layout customizations if I need them.

django-heroku-postgresify==0.2

This tool makes getting the PostgreSQL settings out of Heroku trivial.

django-heroku-memcacheify==0.1

This tool makes getting the memcache settings for Heroku trivial.

gunicorn==0.14.2

All the cool kids who play in devops swear by Gunicorn. I use it because Heroku seems to recommend it for Django deployments.


Installing the above packages

Never copy/paste these libraries directly into your projects. If you do that, you'll end up hating yourself later as your local instances become unmaintained forks of the real project. Also, unless you are really careful in your copy/pasting, you'll be in violation of various open source licenses. Odds are the FOSS police aren't going to find you, but I can assure you that when you bring in one of the authors of these packages to help you fix a problem he/she is going to be mighty annoyed at the lack of attribution.

Do it the right way: do proper Python dependency management.

Create a requirements.txt file and install them as proper dependencies. The file should contain the following text:

Django==1.4
South==0.7.5
django-crispy-forms==1.1.3
django-debug-toolbar==0.9.4
django-extensions==0.8
django-floppyforms==0.4.7
django-heroku-memcacheify==0.1
django-heroku-postgresify==0.2
django-registration==0.8.0
gunicorn==0.14.2
psycopg2==2.4.5

Once you have that, you install them thus in your virtualenv:

pip install -r requirements.txt

Now that I have all this, it's time to code!


May 08, 2012

Moving to Sentry

From Andy McKay on May 08, 2012 05:52 AM

Back in the mists of time I worked at a company that had lots of projects and lots of errors, but no unified way to find the errors. Instead we relied on the users to find them, which sucked. So I wrote Arecibo to track them. It went through many iterations; I even tried to make a business out of it at one point.

But alas it didn't happen and I moved it to open source a while back. If there was any value in the project someone could move it along. I've rewritten it from a multi-home app to a single home App Engine app and then back away from App Engine. A while back we turned it on for Mozilla and hooked up a few projects to it.

The only real problem is that I lost interest in working on it a while ago. I've really only been maintaining it and hoping someone else would pick it up.

A few years ago David Cramer started Sentry. It has now surpassed Arecibo in terms of functionality. The real winner for us was the addition of UDP support: we don't really care about storing every single error, and having something non-blocking is crucial. So we've started to shift away from Arecibo towards Sentry at Mozilla web development.

A few weeks ago we switched Add-ons to Sentry. By a happy coincidence, last week we had a bot that hit an error and we generated 379,000 errors in about 24 hours. Sentry trundled along quite happily. Of course Arecibo isn't dead, it's open source, it's there if you need it. And being open source, may the best project win - and that's Sentry.

There, that's one more project I don't have to feel guilty about not maintaining.

May 04, 2012

Distributing Work in Python Without Celery

From David Cramer on May 04, 2012 10:12 PM

We’ve been migrating a lot of data to various places lately at DISQUS. These generally have been things like running consistency checks on our PostgreSQL shards, or creating a new system which requires a certain form of denormalized data. It usually involves iterating through the results of …

"Day against DRM": pragmatic programmer books are always DRM free

From Reinout van Rees on May 04, 2012 01:57 PM

Today, 4 May, is day against DRM.

The publisher I'm writing my Django book for, the pragmatic bookshelf, is fully DRM free. Lovely. You get your books in a number of formats (I use epub and pdf) without any restriction. Well, they generate the book with your name tucked away somewhere in the header so you cannot post it on some FTP site and get away with it :-)

A nice feature they added a couple of months ago: dropbox integration. Any book you buy is automatically added to your dropbox account! No need to keep things in sync yourself.

So... I don't need to worry that my eBook purchases disappear into some DRM-locked no-longer-supported device. Make sure your books are DRM free, too!

May 03, 2012

Using Travis-CI with Python and Django

From David Cramer on May 03, 2012 06:13 PM

I’ve been using Travis-CI for a while now. Both my personal projects, and even several of the libraries we maintain at DISQUS rely on it for Continuous Integration. I figured it was about time to confess my undying love for Travis, and throw up some notes about the defaults we use in our …

On Fixtures and Factories

From Peter Baumgartner on May 03, 2012 04:19 PM

We’ve made it a general rule to move away from relying on fixtures in our projects. The main reasons are…

May 02, 2012

May 01, 2012

Three things you should never put in your database

From Revolution Systems on May 01, 2012 09:24 PM

As I've said in a few talks, the best way to improve your systems is by first not doing "dumb things". I don't mean you or your development staff are "dumb"; it's easy to overlook the implications of these types of decisions and not realize how bad they are for maintainability, let alone scaling. As a consultant I see this stuff all of the time and I have yet to see it work out well for anyone.

Images, files, and binary data

Your database supports BLOBs, so it must be a good idea to shove your files in there, right? No, it isn't! Hell, it isn't even very convenient to use with many DB language bindings.

There are a few problems with storing files in your database:

  • read/write to a DB is always slower than a filesystem
  • your DB backups grow to be huge and more time consuming
  • access to the files now requires going through your app and DB layers

The last two are the real killers. Storing your thumbnail images in your database? Great, now you can't use nginx or another lightweight web server to serve them up.

Do yourself a favor and store a simple relative path to your files on disk in the database or use something like S3 or any CDN instead.
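As a rough sketch of what "store a simple relative path" can look like, the example below keeps the bytes on disk and only the path in the database. It uses SQLite and a temporary directory so it runs anywhere; the table name, the media_root directory and the helper functions are all made up for this illustration.

```python
import sqlite3
import tempfile
from pathlib import Path

media_root = Path(tempfile.mkdtemp())   # hypothetical media directory
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE thumbnail (id INTEGER PRIMARY KEY, path TEXT NOT NULL)")


def save_thumbnail(relative_path, data):
    """Write the bytes to disk and record only the relative path in the DB."""
    target = media_root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    cursor = db.execute(
        "INSERT INTO thumbnail (path) VALUES (?)", (relative_path,))
    return cursor.lastrowid


def load_thumbnail(thumbnail_id):
    """Look up the path in the DB, then read the file from disk."""
    (relative_path,) = db.execute(
        "SELECT path FROM thumbnail WHERE id = ?", (thumbnail_id,)).fetchone()
    return (media_root / relative_path).read_bytes()


thumb_id = save_thumbnail("thumbs/2012/cat.png", b"\x89PNG fake bytes")
stored_path = db.execute(
    "SELECT path FROM thumbnail WHERE id = ?", (thumb_id,)).fetchone()[0]
loaded = load_thumbnail(thumb_id)
print(stored_path, len(loaded))
```

Because the path is relative, the media directory can later be moved, or served directly by a web server, without touching the rows.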

Ephemeral data

Usage statistics, metrics, GPS locations, session data: anything that is only useful to you for a short period of time or that changes frequently. If you find yourself DELETEing an hour's, day's, or week's worth of some table with a cron job, you're using the wrong tool for the job.

Use redis, statsd/graphite, Riak, or anything else that is better suited to that type of workload. The same advice goes for aggregations of ephemeral data that don't live for very long.

Sure it's possible to use a backhoe to plant some tomatoes in the garden, but it's far faster to grab the shovel in the garage than schedule time with a backhoe and have it arrive at your place and dig. Use the right tool(s) for the job at hand.

Logs

This one seems ok on the surface and the "I might need to use a complex query on them at some point in the future" argument seems to win people over. Storing your logs in a database isn't a HORRIBLE idea, but storing them in the same database as your other production data is.

Maybe you're conservative with your logging and only emit one log line per web request normally. That is still generating a log INSERT for every action on your site that is competing for resources that your users could be using. Turn up your logging to a verbose or debug level and watch your production database catch on fire!

Instead use something like Splunk, Loggly or plain old rotating flat files for your logs. The few times you need to inspect them in odd ways, even to the point of having to write a bit of code to find your answers, are easily outweighed by the constant load that database logging puts on your system.
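For the "plain old rotating flat files" option, Python's standard library already has what you need. The sketch below uses logging.handlers.RotatingFileHandler with a deliberately tiny maxBytes so the rotation is visible; the logger name, file name and sizes are arbitrary choices for this example, and a real site would use far larger limits.

```python
import logging
import logging.handlers
import tempfile
from pathlib import Path

log_dir = Path(tempfile.mkdtemp())
log_file = log_dir / "site.log"

# Roll over after roughly 200 bytes and keep at most two old files.
handler = logging.handlers.RotatingFileHandler(
    str(log_file), maxBytes=200, backupCount=2)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("site.requests")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep these lines out of any root handlers
logger.addHandler(handler)

# One log line per "web request", as in the conservative case above.
for request_number in range(20):
    logger.info("handled request %d", request_number)

handler.close()
log_names = sorted(path.name for path in log_dir.iterdir())
print(log_names)
```

Old files beyond backupCount are discarded automatically, so disk use stays bounded without a cron job and without a single INSERT on the production database.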

But wait, you're a unique snowflake and your problem is SO different that it's ok for you to do one of these three. No you aren't and no it really isn't. Trust me.

django-dynamodb-sessions 0.5 released

From Greg Taylor on May 01, 2012 08:09 PM

I have released django-dynamodb-sessions 0.5 today, addressing an issue with session keys. All users of previous versions are encouraged to upgrade. Thanks go to Adam Nelson for pointing this out.

Weekly Training Site

From Wraithan on May 01, 2012 05:03 PM

I blogged last month about my goals and where I wanted them to be. In response to that I built a site that helps me track where I am for the 7 days trailing. That site can be found at http://training.wraithan.net/.

Users log into my site via Dailymile using OAuth2 since I need to get their API token in order to collect their workouts and display them. You can see my profile at http://training.wraithan.net/profile/Wraithan. At time of writing I am nearing my goal for biking but my running and hiking have suffered.

I built this site using Django. I did all the OAuth2 stuff myself because when I last surveyed the existing work with OAuth2 and Django, I found I would have to write my own. It turns out it is pretty simple, and because of the many drafts that exist, it would be pretty trying to have a more generic app for this. This will hopefully change when OAuth2 is finalized.

Dailymile's API has some warts but it is usable and they were rather responsive when I had some requests for features and the one or two bugs I ran into. Plus their terms of service for their API are really reasonable. I can't say the same about other workout tracking sites I looked into, either they had a horrible ToS or they plain didn't have an API.

Los Angeles Open Source Sprint on May 12th!

From Daniel Greenfeld on May 01, 2012 09:20 AM

This is a day long coding event in Los Angeles for Open Source developers of all languages and skill levels to come and code like fiends. They'll be joined by dozens of either really smart coders or nice people like me. Sponsors are providing food, drinks, venue, and more!

RSVP at http://www.meetup.com/LA-Hackathons/events/62796642/ before it fills up! It's free.

I'll be there to:

  • Organize the event with the assistance of the awesome Los Angeles technical community!
  • Code like a fiend. I want to work on django-mongonaut and could use some GraphViz and JavaScript help.

And now to open the floor to questions...

Where and when?

Where:

Spire.io
7257 Beverly Blvd #210
Los Angeles, CA 90036

When:

May 12, 2012
10 AM to 10 PM

Is this like a Hackathon?

Yup. See http://en.wikipedia.org/wiki/Hackathon#Sprints

Will there be Wifi?

Yes!

I'm just starting as a developer, should I come?

It depends.

If you've never coded before, this isn't the right place. Instead, you might consider one of the local coding workshops or classes. In fact, here's a good bi-weekly hack night / study group for you.

If you've done a tutorial or two, sprints can be a great way to learn new skills or hone your technique by sitting alongside experienced developers who actually need your help. A lot of projects have what are called 'low hanging fruit', which are 'simpler' tasks saved for beginner developers to cut their teeth on. Things I've learned at events like these include Git, Mercurial, jQuery, and a hundred other things that have made me a better coder.

What if I don't have a project of my own to bring? Should I come?

Heck yeah! There will be a number of projects around that you can join and contribute to in order to make the world a better place. There isn't a list up yet, but I'm hoping by Saturday there will be one.

What if I want to come and recruit people?

Absolutely not.

This is not a job fair and we don't want unnecessary distractions.

On the other hand, if you want to help sponsor we'll happily mention you on the meetup.com description.

Are there going to be any presentations or lightning talks?

No.

This is a sprint, not a conference or demonstration. We'll try to limit announcements and interruptions as much as possible; the only exception is letting you know food has arrived.

What should I bring?

Your own functioning laptop with power cord. Neither the event organizers, the venue, nor the sponsors are providing equipment. We also encourage you to bring a power strip labeled with your name.

I'm sold! How much does it cost and where do I register?

The event costs you nothing and you RSVP at http://www.meetup.com/LA-Hackathons/events/62796642/.

April 30, 2012

Integration of Backbone.js with Tastypie

From Patrick Altman on April 30, 2012 12:51 PM

I recently started learning more about backbone.js as a way to create richer app-like experiences on the web without the kludge that results from creating DOM elements in jQuery. I bought all three PeepCodes on the topic and have watched them all now a few times (they are densely packed with good material).

PeepCode Screencasts:

We have been doing more and more of not only these richer app-like user interfaces on the web, but also API-backed iOS apps that extend the reach and functionality of the web application. Therefore, something like backbone.js just makes sense as we can share the API for both the web application as well as the iOS app.

When I first got started, it became obvious to me pretty quickly that backbone.js shipped with a lot of default assumptions about how the REST API worked (predicted url paths, data structures, and so on).

I am a big fan of Tastypie, in fact, all of us at Eldarion are. It's dead simple to extend and customize, enabling us to quickly and easily add APIs to sites.

There were a couple of small things that didn't work out of the box with Tastypie and so after some searching I found a js shim that fixed that (well kind of, there were a few bugs related to creating objects), called backbone-tastypie.js.

I didn't feel up to fixing the bugs in backbone-tastypie. In addition, I wanted to get rid of the dependency on an extra third-party javascript include. The things that caused the incompatibility between backbone.js and Tastypie were so minor that it didn't seem worth it to have this shim. Being a Python developer more than a javascript developer, I wanted to customize things in Python.

I was able to do this with one simple base class that all my ModelResource classes implement:

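from tastypie.resources import ModelResource


class BackboneCompatibleResource(ModelResource):

    class Meta:
        # Return the saved object in the response body after a create/update.
        always_return_data = True

    def alter_list_data_to_serialize(self, request, data):
        # Return the bare list of objects, dropping Tastypie's "meta" envelope.
        return data["objects"]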

With this and the setting of always_return_data = True in the Meta class of each ModelResource, everything was working great without the need for the buggy js shim.

I tweeted a pronouncement of solving this great and mysterious problem and it actually garnered a number of replies from people wanting to know how I solved such a problem. These were replies from people that I respect a great deal and therefore caused me to question myself. Did I really solve this "problem" so easily? What was I missing? There had to be something I was missing.

Then I got a message from Daniel Lindsley, author of Tastypie (as well as many other wonderful open source apps). Turns out, while what I had done will work, it does cause you to miss out on the list metadata that Tastypie provides. Without this metadata, you have to make non-obvious assumptions about how the API works.
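To make that trade-off concrete, here is a minimal sketch in plain Python (no Tastypie needed). The sample payload is made up, the helper names strip_envelope and has_more_pages are hypothetical, and the keys in the meta block are my assumption of what a Tastypie list response looks like.

```python
# Hypothetical sample of a Tastypie-style list response (all values made up).
enveloped = {
    "meta": {
        "limit": 2,
        "offset": 0,
        "total_count": 5,
        "next": "/api/v1/note/?limit=2&offset=2",
        "previous": None,
    },
    "objects": [{"id": 1, "title": "first"}, {"id": 2, "title": "second"}],
}


def strip_envelope(data):
    """Roughly what the alter_list_data_to_serialize override returns."""
    return data["objects"]


def has_more_pages(data):
    """Only answerable while the meta block is still part of the response."""
    return data["meta"]["next"] is not None


bare = strip_envelope(enveloped)
```

Once only the bare list is sent, a client can no longer tell whether there is a next page without guessing.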

Daniel offered a solution (in javascript) that removed the need for the shim but actually added that extra bit of meta data information to the backbone Collection objects which can be used to facilitate smarter paging of data through the API. In addition, there is a bit of code to make sure the URLs have slashes appended to the URL therefore avoiding the need for a redirect on every API call.

window.TastypieModel = Backbone.Model.extend({
    // Backbone's default URL with a trailing slash appended if it is missing,
    // so that the API call does not trigger a redirect.
    base_url: function() {
      var temp_url = Backbone.Model.prototype.url.call(this);
      return (temp_url.charAt(temp_url.length - 1) === '/' ? temp_url : temp_url + '/');
    },

    url: function() {
      return this.base_url();
    }
});

window.TastypieCollection = Backbone.Collection.extend({
    // Keep Tastypie's list metadata for paging and hand Backbone the objects.
    parse: function(response) {
        this.recent_meta = response.meta || {};
        return response.objects || response;
    }
});

Now I just add this bit of javascript to my site's JS and extend TastypieModel and TastypieCollection and everything magically just works. Note, you still need to set always_return_data = True on every ModelResource that you plan to handle POST methods for creating objects. I much prefer having these additional classes over something that extends/modifies standard backbone.

I am still learning a ton, but having fun in the process. If you see that I am doing something wrong here, please don't hesitate to point it out to me; I welcome the criticism/correction.

Invitation to the Django-UserGroup Hamburg on May 9

From Arne Brodowski on April 30, 2012 10:46 AM

The next meeting of the Django-UserGroup Hamburg takes place on Wednesday, May 9, 2012 at 7:30 PM. As on previous occasions, we will meet at the offices of CoreMedia AG at Ludwig-Erhard-Straße 18, 20459 Hamburg (directions on Google Maps).

Please ring the bell for CoreMedia AG at the entrance, go up to the 3rd floor and ask for the Django-UserGroup at the reception desk.

Since a projector is available at the venue, every participant has the opportunity to give a short talk (lightning-talk format or a bit longer). Experience shows that most talks come together on the spot.

As always, everyone who is interested in exchanging ideas with other Djangonauts is invited. Registration is not required, but it helps with planning: Doodle calendar.

More information about the user group is available on our website, www.dughh.de.

April 29, 2012

Django versus ajax or together with ajax?

From Reinout van Rees on April 29, 2012 09:04 PM

Design decision needed! I'm racking my brain over our user interface. We make quite elaborate geographical water-related websites at work (you can see a screenshot at http://lizard.org/). A heading with some navigation, a sidebar, a map (and possibly text and graphs in addition to it), some UI buttons here and there, that sort of stuff.

The question: how much ajax should we use? This is also an important question for Django as a whole. I remember Jacob's keynote in Berlin in 2010 where he effectively said that all those new-fangled dynamic internet thingies were a potential risk to Django:

The web is in flux. NoSQL is taking up fast. Which can be a problem with Django’s original sql-only approach. And what about html5, client side storage, etcetera? It is in any case a challenge of what we know. Jacob’s tip: challenge yourself by examining all those nifty new scary technologies.

I might have paraphrased him a bit too much, but at work I see how much of Django can possibly be displaced by javascript. There are basically three possibilities for our websites:

  • Just use Django and plain html with a bit of javascript for a dropdown menu or some dynamic list sorting.
  • Go the whole hog and make a completely javascript-based "single page app". Django is little more than a glorified ORM with a REST interface that way.
  • Use something in-between, like using javascript to refresh portions of the html page instead of reloading the full page.

We're currently using the in-between version. Often we just re-load the current page in javascript, take out specific divs and replace the corresponding divs in the browser's page with them. It works fine, it is easy to understand and it works fast enough. But it makes some of my colleagues cringe.
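For comparison, the "take out specific divs" step can be sketched in Python too. This is only an illustration with the standard library's html.parser, not the javascript we actually run in the browser; DivExtractor, extract_div and the sample page are made-up names, and the sketch ignores comments and malformed markup.

```python
from html.parser import HTMLParser


class DivExtractor(HTMLParser):
    """Collect the inner HTML of the first <div> with a given id."""

    def __init__(self, div_id):
        super().__init__(convert_charrefs=False)
        self.div_id = div_id
        self.depth = 0      # nesting depth inside the target div; 0 means outside
        self.done = False   # set once the target div has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == "div":
                self.depth += 1
            self.parts.append(self.get_starttag_text())
        elif not self.done and tag == "div" and dict(attrs).get("id") == self.div_id:
            self.depth = 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied verbatim.
        if self.depth:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True
                return
        self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth:
            self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        if self.depth:
            self.parts.append("&#%s;" % name)


def extract_div(html, div_id):
    """Return the inner HTML of the div with this id, or None if not found."""
    parser = DivExtractor(div_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts) if parser.done else None


# Made-up page standing in for a freshly re-loaded full page.
PAGE = (
    '<html><body><div id="sidebar"><ul><li>Rain</li></ul>'
    '<div class="x">nested</div></div><div id="map">map</div></body></html>'
)
sidebar = extract_div(PAGE, "sidebar")
```

The fragment in sidebar is what would replace the contents of the existing sidebar div; everything else on the page stays as it was.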

It makes them cringe as it (probably) feels so old-fashioned to them. Why inject html when you can just grab some json and render that with some javascript-based template language? Why not use backbone.js for the whole website? For an example of great new-fangled web goodies, see my colleague's weblog at http://weblog.nyholt.nl/ . Highly recommended.

My thoughts are probably a bit old-fashioned. But I'd also like to think my thoughts still make solid engineering sense. Here are some of them:

  • Everyone on the team can understand Django's urls. Even if we tie lots of apps together. If we use backbone, we've got two urls.py files to handle: Django's and backbone's. And Django is set up for multiple apps (which we have) and I'm not so sure about backbone.
  • Everyone understands how Django gets from a URL to a view+template and how to build it. Just visit it in your browser and make sure your html is OK. And fix and debug it. With a fully javascript page, debugging an individual json request might be even simpler. But it is a couple of extra steps before you understand where it eventually ends up in the user interface. You've got to read the html bare-bones template and the javascript that requests the json and you need to understand where which part of your javascript renders your json content with which javascript template and when/why/how it requests it.
  • You want DRY (don't repeat yourself), so you cannot make both a django template and a json one. You can have only one, so you must have an empty bare-bones html page. So it means a couple of extra requests to build up the entire page. They might be quick, but they are still a couple of extra requests.
  • Just fetching one sidebar is fine, but often (in our case) you need to update other page elements as well. So do you request a json for every one of the page elements (map buttons, print links, breadcrumbs)? Or a special UI json? Or do you put that info in the "main" json you request, even though that means hard-coding UI information into what's supposed to be plain data json?

So... is it still OK to load HTML fragments into an existing page with javascript? It is a step better than just reloading the entire page. But it doesn't even come close to a nice modern backbone-powered fully dynamic single page app... But the slightly more old-fashioned approach might be easier to work with. And closer to Django, which is easier to work with and easier to understand than a possibly tangled mess of javascript. (Note: I do intend to use backbone.js to clean up and organize some of our existing javascript; backbone.js itself looks fine, I just dare to question its broader use).

Opinions? Tips? Can you suggest a different way that I should look at this problem?


Moving Django to GitHub: the postmortem

From Adrian Holovaty on April 29, 2012 12:12 AM

We finally moved Django to GitHub late yesterday. Here's a postmortem, to keep the community updated and for the benefit of any projects that take this leap in the future.

Background

We've used Subversion to manage our code since originally open-sourcing in July 2005. Over the last few years, we started to feel Subversion's limitations, namely:

  • The difficulty of branching. We used tools like svnmerge to keep track of which parts of branches had been updated from trunk, and some of us on the core team used Git/Mercurial on top of Subversion, but this was all unnecessarily complicated -- to the point of being stifling.
  • Lack of decentralization. When I would hack on Django on an airplane, for example, I couldn't make a bunch of commits locally, then push all of those to the master repository; I'd have to put everything in a single commit. With Subversion, it's all or nothing -- you push everything to a centralized server as you do it (or you use a branch, but that's painful, as noted above).
  • Slowness. After you use Git for a while, Subversion feels sluggish. This is due to a bunch of design and implementation differences.

(Of course, it's 2012 now, and these are all obvious, well-documented points. To the people who responded to our GitHub news by saying "finally!" -- I totally agree.)

Aside from that, we had set up a GitHub mirror (now called django-old) a few years ago, and lots of people were getting code and forking it there anyway.

Why Git/GitHub, as opposed to Mercurial/Bitbucket or some other system? Because it's very well-made, and it's where the people are. Clearly GitHub has won the majority of open-source developers' mindshare. John Lennon said: "If I'd lived in Roman times, I'd have lived in Rome. Where else?" GitHub is Rome.

The authors file

The first thing we considered was to simply start using our existing GitHub mirror -- turn off the Subversion stuff and start committing there directly. But the problem there was that we'd never set up an authors file.

Basically, an authors file maps Subversion committer names to standard names and email addresses, so that GitHub knows that a commit by "adrian" in Subversion maps to the adrianholovaty GitHub account. With that mapping established, you get niceties like GitHub commits linking to appropriate GitHub user pages and displaying proper user avatar images. More importantly, it gives all of our contributors proper credit within the GitHub ecosystem for the full history of their work on Django -- which has value these days, considering companies are looking at GitHub involvement for job applicants, etc.
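For anyone who hasn't seen one: as far as I know, git-svn expects one line per Subversion login in the form "login = Full Name <email>". Here's a small Python sketch that parses that format. The email addresses are placeholders, not real entries from our file, and parse_authors is a made-up helper name.

```python
import re

# One line per Subversion login: "login = Full Name <email>"
AUTHOR_LINE = re.compile(
    r"^(?P<login>\S+)\s*=\s*(?P<name>.+?)\s*<(?P<email>[^>]*)>$"
)


def parse_authors(text):
    """Map each Subversion login to a (name, email) tuple."""
    authors = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = AUTHOR_LINE.match(line)
        if match is None:
            raise ValueError("malformed authors line: %r" % line)
        authors[match.group("login")] = (match.group("name"), match.group("email"))
    return authors


# Placeholder addresses, for illustration only.
SAMPLE = """\
adrian = Adrian Holovaty <adrian@example.com>
jacob = Jacob Kaplan-Moss <jacob@example.com>
"""

authors = parse_authors(SAMPLE)
```

A check like this is handy before a long import, because a login that is missing from the file only shows up once git-svn reaches a commit by that person.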

So the first step was creating that authors file, which Brian Rosner organized, with the help of several other people. We ended up accounting for every one of the 58 people who have ever committed to Django, except for somebody named "cell" who was given temporary commit access during a sprint six years ago.

One crucial detail is that we couldn't simply change the commit data retroactively in the existing GitHub repository. That's because Git uses the committer data in creating hashes. Changing the commit data would change the hashes, which would break all existing forks of that repository. (We ended up breaking existing forks anyway, of course, but it was cleaner to do it from scratch.)
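Here's a rough Python sketch of why that is. A Git commit ID is the SHA-1 of a small text object that contains the tree, the parents, the author and committer lines and the message, so changing a name or email changes the ID. The function name, identities and timestamps below are made up for illustration; the tree value is Git's well-known empty-tree hash.

```python
import hashlib

# Git's well-known hash of the empty tree, used here as a stand-in.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def git_commit_hash(tree, author, committer, message, parents=()):
    """Compute the ID Git would assign to a commit object with these fields."""
    lines = ["tree %s" % tree]
    lines += ["parent %s" % parent for parent in parents]
    lines.append("author %s" % author)
    lines.append("committer %s" % committer)
    body = ("\n".join(lines) + "\n\n" + message).encode("utf-8")
    header = ("commit %d" % len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Made-up identities and timestamp.
svn_style = "adrian <adrian@localhost> 1335600000 -0500"
mapped = "Adrian Holovaty <adrian@example.com> 1335600000 -0500"

before = git_commit_hash(EMPTY_TREE, svn_style, svn_style, "Example commit\n")
after = git_commit_hash(EMPTY_TREE, mapped, mapped, "Example commit\n")
```

The two IDs differ even though the tree and message are identical, and because every child commit records its parent's ID, the change ripples through all later history.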

Nuts and bolts of the process

Once we finalized the authors file, doing the migration was actually kind of easy, thanks to git-svn. I took many missteps along the way, got a lot of help from people in #django-dev on IRC and ended up doing three dry runs. Here are the final steps I ended up taking:

1. Copied the Subversion repository from code.djangoproject.com to my laptop, to make the migration faster.

# On the server:
svnadmin dump /home/svn/django | gzip > svndump.gz

# On my laptop:
scp djangoproject.com:svndump.gz .
gunzip svndump.gz
svnadmin create /Users/adrian/code/django-svn
svnadmin load /Users/adrian/code/django-svn < svndump

On my first run of git-svn, I ran it from my laptop and pointed it at code.djangoproject.com, and it took 3.5 hours! After I copied the repo to my laptop and tried it again, it took a little over an hour. But the caveat here is that I also changed the git-svn command between those two runs, so I'm not sure how much of the speed improvement was because of the local SVN repo.

2. Ran git-svn (with the correct arguments!).

git svn --authors-file=authors.txt --trunk=trunk clone file:///Users/adrian/code/django-svn/django/ django-dry-run

This took a little over an hour, and it created a Git repository called django-dry-run. Note that authors.txt is the authors file, as explained above.

The trickiest thing about this was determining the correct arguments to use -- specifically, whether to use --branches explicitly or --stdlayout. As you can see, I ended up using neither.

Originally, the plan was to migrate all of the branches from our Subversion history -- classics such as magic-removal, new-admin, newforms-admin, unicode, queryset-refactor and multidb -- so that the branches' commit histories (which have all since been merged to trunk) could be preserved in our new Git history. Many of those branches were very involved, with a lot of commits, and there's a lot of value in being able to isolate specific commits in the branch, rather than one large merge commit. (Imagine you're investigating the original reason we added a line of code, for example.)

But as we discussed this over IRC, we decided it wasn't worth the effort, we could always do it later and git-svn wouldn't actually do it the way we wanted. Ideally, I'd like these branches' histories to be migrated such that they're treated like merged branches in Git -- a merge commit that knows the individual commits on the branch. If you know how to pull this off, and it can be done without altering the Git hashes, please let me know.

3. Changed git-svn-id to point at code.djangoproject.com instead of my laptop.

git filter-branch --msg-filter "sed \"s|^git-svn-id: file:///Users/adrian/code/django-svn/django/trunk|git-svn-id: http://code.djangoproject.com/svn/django/trunk|g\"" -- master

git-svn adds a "git-svn-id" section to each commit message in the resulting Git repository. It includes a URL pointing to the commit in the original Subversion repository, which is very useful.

But, because I did the import from a local repository, the git-svn-id's were all pointing at my laptop. So I ran git filter-branch to clean it up.
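If the sed one-liner is hard to read, this Python sketch does the same rewrite on a single commit message. It's an illustration only, not what I ran; fix_git_svn_id is a made-up name and the revision and UUID in the sample message are placeholders.

```python
import re

OLD_PREFIX = "git-svn-id: file:///Users/adrian/code/django-svn/django/trunk"
NEW_PREFIX = "git-svn-id: http://code.djangoproject.com/svn/django/trunk"


def fix_git_svn_id(message):
    """Rewrite the git-svn-id prefix wherever it starts a line, like the sed filter."""
    pattern = r"^" + re.escape(OLD_PREFIX)
    return re.sub(pattern, NEW_PREFIX, message, flags=re.MULTILINE)


# Placeholder revision number and repository UUID.
sample = (
    "Fixed a typo in the docs.\n"
    "\n"
    "git-svn-id: file:///Users/adrian/code/django-svn/django/trunk@17942 "
    "00000000-0000-0000-0000-000000000000\n"
)
fixed = fix_git_svn_id(sample)
```

git filter-branch pipes every commit message through the filter on stdin and uses whatever comes out on stdout as the new message.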

4. Renamed old GitHub django repository to django-old.

(Done via the GitHub Web site.) This was the scary part, because it meant there was no turning back. :-)

Originally we'd talked about deleting the repository outright, but that would have deleted all pull requests and likely would have broken some other things. So I just renamed it to django-old. Not sure how long we'll keep this around.

5. Imported the new repository into GitHub.

git remote add origin git@github.com:django/django.git
git push -u origin master

I spotted an error in the repository after the first time I did it, so I had to delete it -- which I thought made for a rare and amusing screenshot:

Screenshot of GitHub deletion step

Then I cleaned up the repository and did it again. I mistakenly created it as a private repository, so I marked it as public, which led GitHub to believe I had just open-sourced Django. :-)

Screenshot of GitHub upload

And that's it!

Stats

For posterity:

  • Final number of commits in our Subversion repository: 17,942.
  • Size of Subversion repository: 339 MB gzipped. (That's for the dump file as generated by svnadmin dump.)
  • Number of commits created in Git by git-svn: 11,883. (This is less than 17,942 because we only migrated trunk. Any commit to our repository that didn't touch Django trunk -- such as commits to the django_website project or commits to branches -- did not get migrated.)
  • Number of forks of the old (mirror) GitHub repository, as of this writing: 783.

Going forward

  • The old Subversion repository will remain indefinitely, for the benefit of scripts out there that do automatic updates, and general stability of the Django world. There won't be any more commits there, obviously.
  • If we ever need to dive into the history of one of the big merged branches -- such as magic-removal -- we can do so in the Subversion history. Or we can consider copying the branch history into Git somehow (see above).
  • I'd like us to provide some documentation on how to convert your previous Django fork (from the django-old repository) to track the new repository. Any volunteers?
  • We still have a bunch of work to do fixing places in our documentation and code.djangoproject.com that refer to Subversion. Bear with us.

Filing bugs / pull requests / the ticket system

GitHub's ticket system is a bit too simple for our needs, given the Django triage process, so we're sticking with our Trac installation, at least for the time being.

But, of course, we want to take advantage of GitHub pull requests at the same time. So we'll need to figure out the right balance between pull requests and Trac tickets, such that we maintain our sanity, we don't make people jump through hoops, and we optimize for contributor and committer productivity.

Personally, I want to avoid a situation (and culture) where we force contributors to use Trac if they post pull requests, especially ones that contain trivial changes. But at the same time, it'll likely become a maintenance nightmare if we have lots of tickets in two places, with no coordination. So, this is an open issue we'll be working to figure out. Jacob has been working on a technological solution.

Thanks to all the people who helped with this transition, and I look forward to the much happier development and collaboration experiences we get with GitHub. The commits and pull requests I've already handled have been a pleasure.

April 27, 2012

Whiskers and buildout.sendpickedversions

From Mark van Lent on April 27, 2012 02:01 PM

Last year I participated in a deployment knowledge sharing session and I started implementing changes at my company pretty soon after. The result is that we are using Puppet for some parts of our server configuration. We also added Munin to our monitoring toolset (and I used Puppet to deploy Munin and manage its configuration). But an important piece that was still missing in our setup was an overview of which packages we use in the buildouts of our clients and more specifically which version each client uses.

Apparently I was not the only one that wanted to have such an overview: Jukka Ojaniemi created Whiskers (PyPI, GitHub) and released version 0.1 in December 2011. Whiskers is a Pyramid application and it is intended to be used in combination with the buildout extension buildout.sendpickedversions (PyPI, GitHub).

Setting up Whiskers is very simple (see the Whiskers README for details) and since the data is stored in an SQLite database there is little infrastructure needed. The buildout side is even less work; you only have to add the following:

[buildout]
...
extensions += buildout.sendpickedversions
buildoutname = <buildout-name>
whiskers-url = <whisker-server-url>/buildouts/add

And the result after modifying several buildout configurations is a nice overview of which packages (and versions) are used by each buildout.

Buildout details

But you can also view a package and see which versions are used in which buildouts.

Package details

For the Edition1 Whiskers server, I wanted to change the CSS to make the header and footer match our company colors and change the used font. Perhaps Pyramid provides a solution to override static files included in a package, but I chose to copy the whiskers.css file to another directory, modify it and have Apache serve my file.

Note that currently Whiskers has some rough edges. For instance: not all packages are registered properly. I am using a checkout of my fork for now until there is a new release where this is fixed (yes, I issued a pull request).

The package view (which was shown in the second screenshot), currently does not sort the versions and does not hide versions that are not used by any buildout. I personally don't like that so I issued another pull request in the hope it will be included in a next release.

Although Whiskers may not be perfect yet, I quite like it and am happy that I finally took the time to set things up.

Update (2012-04-28): Both issues are solved in version 0.2. Which means I can recommend Whiskers even more. :)

django-dynamodb-sessions 0.4 released

From Greg Taylor on April 27, 2012 01:40 AM

django-dynamodb-sessions 0.4 has been released. The only change made is to add Django 1.4 compatibility, courtesy of Adam Nelson. See the PyPI page for more details, and of course, follow the project on GitHub!

April 26, 2012

Gerbi CMS

From Ian Ward on April 26, 2012 09:12 PM

Gerbi CMS (nee django-page-cms) is a multilingual content management system written in Python and based on the Django web framework. It's currently my favourite CMS software and I use it for a number of web sites I administer.

I'll be giving a talk resembling this article about Gerbi CMS at the next OCLUG meeting.

April 25, 2012

Sticking With Standards

From David Cramer on April 25, 2012 05:23 AM

More and more I’m seeing the “requirements.txt pattern” come up. This generally refers to projects (but not just), and seems to have started around the same time as Heroku adopting Python. I feel like this is something that matters in the Python world, and because I have an …

April 24, 2012

Class Based Views Part 3: DetailView and template_name Shortcut

From GoDjango on April 24, 2012 08:00 PM

The DetailView is an important class based view since it allows us to show off details of our data instead of just bits here and there. It is also very simple to use and will save you time. In this video you are also going to see a nice little shortcut with your templates to save you from writing a couple of extra lines.

April 23, 2012

Ginger Tech Stack

From Peter Baumgartner on April 23, 2012 04:20 PM

April 20, 2012

First presentation of Skeltrack

From Joaquim Rocha on April 20, 2012 09:40 PM

I spent the first half of this week in the beautiful city of Évora, where I was born. The occasion was the Semana da Ciência e Tecnologia (Science and Technology Week) of the University of Évora, to which I was invited.
I also ended up giving the organization a hand by inviting Thomas Perl (the restless mind behind gPodder) and Lucas Rocha (well-known GNOME developer now using his powers at Mozilla), who both kindly accepted.

Having participated in the organization of events during my time at the University, I’m always happy to see these initiatives taking place.
It was also great to spend a couple of days with the folks at my University and meet with old friends.

About the talks, Thomas gave an overview of gPodder and the infrastructure used to manage the project. Lucas gave a really nice talk about what Mozilla is, what it does and why you should care; because of it, I ended up installing Firefox Mobile nightly build for Android and it has improved a LOT.
My friend Luís Rodrigues (no blog because he’s a badass) talked about CERN, where he works. What an amazing place! He talked about how much CERN uses Python and Django to manage their data. As a Python lover, this makes me really happy.

This was also the first time I presented Skeltrack, my latest creation inside Igalia. Presenting such an algorithm is not an easy job, so I took mental notes about what to improve the next time (which will be at LinuxTag), but I was happy that people asked good questions about it.

I’d like to thank the AAUE (Students Association) for the great time we all spent there.


LearnScripture.net launched

From Luke Plant on April 20, 2012 09:46 AM

I've launched a new Bible memorization service, LearnScripture.net.

It was inspired by the great language learning website memrise.com, and by a series that I was doing on Jesus' use of the Bible.

I designed it with my congregation in mind, so it's child friendly, and also mobile-phone friendly, as well as being socially-oriented. Another unique selling point is good support for learning extended passages, rather than just individual verses.

It's Django-powered, of course, and there are tons of people to thank. I'd particularly like to thank the many people who worked on Django ticket 2879 - live server support which landed in Django 1.4 - they are listed here, and especially Julien Phalip who pushed it through. My site uses a lot of javascript, and I think it simply wouldn't have been possible without being able to have Selenium tests integrated into my Django test suite.

Unlike memrise.com, who, according to their faq have no idea how they are going to make money, I decided to make this a paid service, so it will hopefully support itself and support me a bit too (I'm in a small church, so currently have to supplement my salary with freelance programming work).

You can follow it on twitter if you want to hear more.

April 19, 2012

Down the rabbit hole, profiling your Python code - Remco Wendt

From Reinout van Rees on April 19, 2012 12:34 PM

(Talk at the April 2012 Dutch Django meeting)

There's a lot happening between an incoming request and an outgoing response. Part of it is your code, part is in libraries. You don't care about most of those parts, as you probably mostly care about the resulting end product for the customer.

There is a lot of interest in scaling, but not so much in profiling your performance. Profiling means running your code in such a way that Python's interpreter gathers statistics on all the calls you make. This has a huge performance impact, so don't use it in production. But it gives you invaluable data on what's actually happening in your code.

The most interesting thing about profiling is the low hanging fruit. Often there are two or three expensive functions that you can easily improve: with a limited effort you get a lot of extra performance. It is not effective to focus on a hard problem that you can only improve by 2%.

Python has lots of tools. The most well-known is cProfile (profile is not that good; hotshot seems deprecated). Line profiler looks at the number of times a line is executed.

Run cProfile like this:

import cProfile
cProfile.run('your_method()')

An alternative is:

python -m cProfile -o your_script.profile your_script.py

With that -o option you get an output file that you can run through Python's pstats to get the actual statistics.
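As a minimal sketch of that round trip (my own example, not from the talk): profile a function, write the statistics to a file and read them back with pstats. The functions slow_sum and helper and the file name are made up for illustration.

```python
import cProfile
import io
import os
import pstats
import tempfile


def helper(i):
    return i * i


def slow_sum(n):
    total = 0
    for i in range(n):
        total += helper(i)  # called n times: shows up as a high call count
    return total


def profile_to_file(path, n=10000):
    """Run slow_sum under the profiler and dump the raw statistics to path."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = slow_sum(n)
    profiler.disable()
    profiler.dump_stats(path)
    return result


def report(path, limit=5):
    """Load a stats file and return the top entries by cumulative time as text."""
    out = io.StringIO()
    stats = pstats.Stats(path, stream=out)
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


path = os.path.join(tempfile.mkdtemp(), "example.profile")
result = profile_to_file(path)
text = report(path)
```

Sorting by "calls" instead of "cumulative" is a quick way to spot functions that are called suspiciously often.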

A very handy visualizer is RunSnakeRun, which displays the profiling information as a "tree map". An alternative is kcachegrind, but you need to call pyprof2calltree to convert Python's profiling information to kcachegrind's format.

What to look for:

  • Things you didn't expect. Perhaps you spot something suboptimal or strange that needs investigating.
  • If there's much time spent in just one single function. This is possibly low-hanging fruit.
  • Lots of calls to the same function.

Some things you can do to improve your performance:

  • Caching

  • Get stuff out of inner loops.

  • Remove logging. And especially watch out when logging database objects in Django: your object's __unicode__() might call more than you want, like self.parent.xyz...

    Regarding debug logging: you can make debug log calls conditional with if __debug__:. Running python with -O optimizes them away.
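The logging point is easy to try out. This is a small sketch with a hypothetical Node class whose __str__ counts how often it is called. In a Django project that method could be a __unicode__() that follows foreign keys. The logger name is made up too.

```python
import logging

logger = logging.getLogger("profiling_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)  # debug messages are switched off


class Node:
    """Hypothetical stand-in for a model with an expensive string form."""

    str_calls = 0

    def __str__(self):
        Node.str_calls += 1
        return "node"


node = Node()

# Eager formatting: __str__ runs even though the message is thrown away.
logger.debug("Saving %s" % node)

# Lazy formatting: logging only builds the message if it will be emitted.
logger.debug("Saving %s", node)

# Guarded call: the whole statement disappears when Python runs with -O.
if __debug__:
    logger.debug("Saving %s", node)

# Only the eager call reached __str__.
print(Node.str_calls)
```

Passing the object as an argument instead of formatting it yourself is the cheapest fix. The __debug__ guard goes one step further, as it also removes the function call itself.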

Apart from code profiling (cpu/IO) there's also memory profiling for looking at memory usage. Small note on Django: it has an (intended) memory leak in debug mode (the query cache). No real problem, but keep it in mind when doing memory profiling.

Tools for memory profiling: heapy and meliae. Meliae is nice as you can run it on your server (ahem) and then copy the resulting dump to your local machine for evaluation with, again, RunSnakeRun.

Profiling is all good and fun, but the environment is different on your production server. How to do profiling there? You might have one of several wsgi processes that runs in profiling mode, for instance, with a load balancer that only trickles a few requests to that single wsgi process.
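For a first impression without installing anything, newer Python versions ship tracemalloc in the standard library. To be clear: this is a different tool from the heapy and meliae mentioned in the talk, and the build_cache function below is just a made-up stand-in for something that keeps growing, like a query log.

```python
import tracemalloc


def build_cache(n):
    """Hypothetical leak: a list that keeps growing, like a log of queries."""
    return [("SELECT %d" % i, i) for i in range(n)]


tracemalloc.start()
before = tracemalloc.take_snapshot()

cache = build_cache(50000)

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Show the source lines whose allocations grew the most between snapshots.
for stat in after.compare_to(before, "lineno")[:3]:
    print(stat)
```

Comparing two snapshots rather than looking at a single one is what makes growth stand out.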

Or you can use Boaz Leskes' pycounters, "instrumenting production code".

To close off: you should know about this. It should be part of your professional toolkit. And... it should be in IDEs. Several of them already have it. Komodo already has it, but what about PyCharm? Remco hopes that this blog entry sparks IDE vendors into action when needed :-)

Some input from the questions:

  • Django has profiling middleware that you can switch on for a specific request with a GET parameter.
  • There's WSGI middleware (like dozer).
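To give an idea of what such middleware does, here is a rough sketch of a WSGI wrapper that only profiles a request when the query string contains profile=1. It is not the Django middleware or dozer mentioned above. ProfileMiddleware, hello_app and the profile parameter are all made up for illustration.

```python
import cProfile
import io
import pstats
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Hypothetical WSGI application standing in for your real project."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


class ProfileMiddleware:
    """Return a profile report instead of the page when ?profile=1 is given."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        query = parse_qs(environ.get("QUERY_STRING", ""))
        if query.get("profile") != ["1"]:
            return self.app(environ, start_response)

        def discard(status, headers, exc_info=None):
            """Swallow the wrapped app's status line; we send our own."""

        def run():
            # Consume the body inside the profiler so lazy responses count too.
            return b"".join(self.app(environ, discard))

        profiler = cProfile.Profile()
        profiler.runcall(run)
        report = io.StringIO()
        stats = pstats.Stats(profiler, stream=report)
        stats.sort_stats("cumulative").print_stats(10)
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
        return [report.getvalue().encode("utf-8")]


# Try it without a server: build a fake request and call the app directly.
environ = {}
setup_testing_defaults(environ)
environ["QUERY_STRING"] = "profile=1"
app = ProfileMiddleware(hello_app)
print(b"".join(app(environ, lambda status, headers: None)).decode("utf-8"))
```

Something like this should only be reachable for developers, as the report exposes internals and profiling slows the request down.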

Django Generic Class Based View Tip

From Wraithan on April 19, 2012 01:00 PM

Those reading this via a feed reader will have to view the page on my blog as I am using embedded gists. I'll find a solution for that in the future.

So say you have a base template that looks something like:
And a template that looks like either of these:
It used to be that you could write something like this:
But generic function based views are deprecated and the world is being strongly urged to move to generic class based views. If you would like to get extra_context working with the direct_to_template replacement TemplateView, you can use a view like the following:
And a urls.py like the following using it:
The code used in this blog post can be found in this gist.

April 18, 2012

Lightning talks at the Dutch Django meeting (part two)

From Reinout van Rees on April 18, 2012 07:20 PM

Lightning talks at the April 2012 Dutch Django meeting

Customizing the Django admin interface - Arthur de Jong

They built http://publications.cta.int/ . They have been using Django since 2006 and they love it. They have large customers and work on "business-critical IT systems".

http://publications.cta.int/ is a Django site that works quite well. The problem? The publications from that website are ordered by people from Africa, which means a big multilingual need! 8k publications are ordered per month and there are 2.7M objects in the database.

He showed a slide with the libraries they were using. Wow, that was a whopping lot of open source code. (Personal note: scare my boss with that number; what we're doing is still pretty OK :-) )

Their project uses the Django admin quite a lot for managing, for instance, orders. But they wanted some extra javascript interaction, for instance for more easily deleting or adding items to an order. Adding functionality to the admin interface is more complex and harder than they expected.

They used generic relations, but they don't perform well. At least not when you get to 50k objects. Comment from the audience: perhaps prefetch_related in the latest Django might help a lot!

Oh, and SOLR is really cool.

Using an IDE for Django development - Diederik

Using an IDE might be controversial: many people are very happy with VI and EMACS. But he tries to show us that we ought to use an IDE.

An IDE ensures your imports are correct, that you don't have undefined variables, no CSS spelling errors and no unused imports. With hand-written code in a non-IDE-editor you must surely get all those errors.

Diederik uses PyCharm, which has nice Python and Django integration and helps you write better code. It also helps you figure out Django internals. And it has very handy completion.

Personal note: you can get all that pep8/pyflakes checking with emacs (and probably VI), too. See http://reinout.vanrees.org/weblog/2010/05/11/pep8-pyflakes-emacs.html

Django-counters - Boaz Leskes

Boaz already said something about pycounters in today's Django profiling talk and now he showed some screenshots of django-counters' output. Django-counters is the pycounters version for Django.

You can get to know important stuff about production, like the number of requests, the time requests take and so on. He wants to turn django-counters into an easily installable app (for pycounters you need to install munin for visualization).

The vision is for it to be easy to use. View stats in the admin. And... he wants help building it. You can give him help on bitbucket: https://bitbucket.org/bleskes/pycounters .

(Note: my brother also had a blog entry about Boaz' work: http://maurits.vanrees.org/weblog/archive/2011/10/pun#boaz-leskes-pycounters ).

Lightning talks at the Dutch Django meeting (part one)

From Reinout van Rees on April 18, 2012 05:51 PM

Lightning talks at the April 2012 Dutch Django meeting

Code with style: PEP8 and Pylint - Johan Otten

Readability counts and PEP8 gives you a consistent code style. And it is BDFL-approved.

There's a tool for it: http://pypi.python.org/pypi/pep8 .

Pylint gives you static program analysis. There's a lot it checks.

Also look at PEP 257 about docstrings. Read it once or twice, just to get it a bit in the back of your head. It is not that important.

Note: there's also a django lint.

Very useful: there's a Jenkins "violations" plugin. This helps you see the number of pep8/pylint violations. Also there's django-jenkins that helps a lot in setting up Jenkins for Django projects.

(Personal note: I really agree with PEP8, I even helped get pep8 on pypi (it was a standalone .py file earlier). And I personally prefer pyflakes to pylint. Way easier to run than pylint and you get a lot of the benefits. And if you want to use pep8 and pyflakes in emacs, see http://reinout.vanrees.org/weblog/2010/05/11/pep8-pyflakes-emacs.html ).

Building apps using high-level models - Jeroen Vloothuis

Jeroen worked on a project with a questionnaire with lots of pages and questions. Boring code to write, adding all those questions as Django model fields. Lots of model code, lots of view code.

The solution: let's generate one data structure to rule everything for me. So he defined some Questionaire and Question classes that he could define questionnaires with in a simple syntax (one line per question).

He would then use that information to generate the Django models! The same for generating the forms.
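Jeroen's own code is in his blog post. Just to illustrate the general idea, here is a sketch in plain Python (so it runs without Django) that turns a one-line-per-question definition into a class with type(). With Django you would presumably pass models.Model as the base class and field instances as attributes. All names here are made up.

```python
from collections import namedtuple

# One line per question: field name, label and expected Python type.
Question = namedtuple("Question", "name label kind")

QUESTIONS = [
    Question("age", "How old are you?", int),
    Question("city", "Where do you live?", str),
    Question("newsletter", "Subscribe to the newsletter?", bool),
]


def build_answer_class(class_name, questions):
    """Generate a class with one attribute per question using type()."""

    def __init__(self, **answers):
        for question in questions:
            value = answers.get(question.name)
            if value is not None and not isinstance(value, question.kind):
                raise TypeError(
                    "%s must be %s" % (question.name, question.kind.__name__)
                )
            setattr(self, question.name, value)

    attrs = {
        "__init__": __init__,
        "labels": {question.name: question.label for question in questions},
    }
    return type(class_name, (object,), attrs)


SurveyAnswers = build_answer_class("SurveyAnswers", QUESTIONS)
answers = SurveyAnswers(age=31, city="Utrecht")
print(answers.age, answers.city, answers.newsletter)
```

The list of questions is the single source of truth: the same list could drive a form class in exactly the same way.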

The good thing: it's really DRY (Don't Repeat Yourself) and it is nicely declarative. The bad thing is when you need to do debugging (the code is generated, so you cannot stick a pdb in there somewhere). And you need quite some Django skill to pull it off.

Jeroen has a blog post about this; you can see code examples there.

April 17, 2012

Class Based Views Part 2: ListView and FormView

From GoDjango on April 17, 2012 06:30 PM

The ListView and FormView class based generic views are the first look we have at generic views with some power behind them, which can really save us some code. The ListView is great for showing and paginating content with very little effort, while the FormView is great for dealing with class based forms without having to deal too much with the underlying request itself.
Watch Now...

April 15, 2012

Virtual currency site in testing phase

From Will McGugan on April 15, 2012 08:26 PM

I made currency site available for testing today. See my previous post for the back-story, but in essence currency site is a virtual currency platform.

I sometimes object to the word virtual in ‘virtual currency’. Most of the money I possess is not in any physical form; it's merely a number stored in a database somewhere – and transactions occur without any kind of physical objects changing hands. So the word ‘virtual’ seems entirely redundant, since there is no qualitative difference between virtual and ‘real’ money that I can see. The only difference is the level of trust in the system.

But I digress. Currency site is a platform for virtual currencies, in that it is up to the users to create and manage currencies. The site just provides the tools. What currencies are used for is irrelevant as far as the platform is concerned. It could be for a house of students to manage the housework, or for a community to exchange goods and services. Regardless of what a currency is used for, there has to be a certain amount of trust in the system. The platform has to be reliable, in that you shouldn't be able to create currency without a valid transaction. Currency site is centralised, which makes that requirement simpler–the system keeps track of how much is owned, in the same way you trust banks to keep track of the money in your accounts.

The second level of trust is in the creator of the currency. The creator of the currency has the extra responsibility of defining how much of that currency is available at any one time. This is done by minting new currency. For instance, if the provider creates a currency and mints 1,000,000 virtual bucks then only 1,000,000 will ever be available to other users. It could be owned by a single person, or by a million people owning one virtual buck. Alternatively, since all currencies are divisible by 100, it could be that 100,000,000 people own 0.01 virtual bucks (a virtual cent?). However it is distributed there will be no more than one million virtual bucks in existence unless more is minted. A record of the currency mints is public, as well as information about how much currency exists and how much is in general circulation, which allows regular users to keep an eye on how the currency is managed.

From a techy side, currency site wasn't all that challenging. Sure it was a few months of part time work, but it was mostly user interface code. I wanted to make something that worked like online banking, but not as painful (I've never used an online banking system that didn't make me want to tear my hair out). That was helped by using Twitter's Bootstrap CSS framework, which creates an elegant user interface with simple markup.

There was only one piece of code that wasn't as straightforward as it appeared (and it was kind of fundamental). A currency site transaction basically involves subtracting a value from the source account and adding it to the destination account. In pseudo-code, it is simply this:

if source_account.balance < amount:
    raise TransferError("Not enough currency")
source_account.balance -= amount
destination_account.balance += amount

In essence, that's all that is done, but things get more complex in the context of a web application where multiple transactions may occur simultaneously. For example, if an account A contains 30 virtual bucks and the owner attempts to send 20 virtual bucks to account B and simultaneously sends 20 virtual bucks to account C, one of those transactions has to fail – otherwise we may end up with a negative balance, which is not allowed. The if statement checks if the account has enough currency, but if both those transactions occur simultaneously then both checks can pass before either subtraction happens, and they will both subtract 20 from the source account (leaving -10). Granted, this could only occur in a small window of time, but there is no way to recover if it does.

I couldn't figure out how to handle this situation elegantly with the Django ORM, and I don't like resorting to custom SQL. Fortunately, the recent release of Django 1.4 came to the rescue with the addition of select_for_update, which does row level locking. Basically it allowed me to lock the two account objects so that no other process is permitted to modify them until the currency has been transferred. Another consideration is that the entire thing has to be done in a (database) transaction, since half a (currency) transaction could result in currency being subtracted from the source account without being added to the destination (in effect, disappearing currency from the system). To keep the currency consistent, the pseudo-code becomes:

begin_transaction()
lock(source_account, destination_account)
if source_account.balance < amount:
    raise TransferError("Not enough currency")
source_account.balance -= amount
destination_account.balance += amount
commit_transaction()
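For anyone who wants to play with the idea without a Django project, here is a rough runnable sketch that uses the standard library's sqlite3 instead of the Django ORM. It is not the code from the site. SQLite has no row level locks, so BEGIN IMMEDIATE (which takes the database's write lock) stands in for select_for_update. The account table and the transfer function are made up, and balances are stored as whole cents.

```python
import sqlite3


class TransferError(Exception):
    pass


def transfer(conn, source_id, destination_id, amount):
    """Move amount (in cents) between two accounts, all or nothing."""
    # Take the write lock before reading, so the balance check and the
    # updates cannot interleave with another transfer.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (source_id,)
        ).fetchone()
        if row is None or row[0] < amount:
            raise TransferError("Not enough currency")
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, source_id),
        )
        credited = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (amount, destination_id),
        ).rowcount
        if credited != 1:
            raise TransferError("Unknown destination account")
        conn.execute("COMMIT")
    except Exception:
        # Undo the half-finished transfer so no currency disappears.
        conn.execute("ROLLBACK")
        raise


# isolation_level=None: we issue BEGIN/COMMIT/ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("A", 3000), ("B", 0), ("C", 0)]
)

transfer(conn, "A", "B", 2000)  # succeeds, A is left with 1000 cents
try:
    transfer(conn, "A", "C", 2000)  # fails, A no longer has enough
except TransferError as error:
    print(error)
print(conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall())
```

The important part is the order: lock first, then check the balance, then update. Checking before taking the lock would bring back the race described above.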

I don't think there is much else in the code that is blog-worthy, although there are still plenty of features I'm thinking of adding. I'm considering allowing users to trade currencies, which might be interesting. I would also like to build an API, so users could pay for web content with virtual currency. A few more evenings of hacking in there I think…

If you would like to help with the testing then head on over to http://currency.willmcgugan.com/ (username: currency, password: reliance). If you let me know, I'll send you 100 beta bucks for your time. Bear in mind it's in an early testing phase, so don't use it for anything serious – I'll be wiping the database before it goes live. I'd also be interested in suggestions for a proper domain name!