Wednesday, 30 April 2008

Using reCAPTCHA with Google App Engine

We have no image creation capabilities in Google App Engine (yet). So if you want to display a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) within a GAE application, you need to do this by using an external service provider.

One of these providers is reCAPTCHA - and the nice thing about reCAPTCHA is that your visitors help to digitize books by solving the CAPTCHA.

So how do we start?
First, get a service key on recaptcha.net.
After you have registered, you can create an unlimited number of service keys for your domain.
Note: If you want to use reCAPTCHA with your dev environment, you need to get keys for localhost.
After generating a pair of keys, the screen should look like this:

Write down your public and private key - we will need them later. If you forget them, you can return to recaptcha.net to look them up. Don't tell anyone your private key.

After registering for the service, get the simple service class I adapted from python.org for GAE. You can download the adapted class here. Rename it to captcha.py and copy it somewhere within your App Engine project (I prefer a subdirectory named recaptcha/client/ - remember to add an empty __init__.py to every directory if you do this).

For demonstration, I will be extending the helloworld example from GAE.

Go to your helloworld.py and import environ and the captcha module:
from os import environ
from recaptcha.client import captcha
Then go to your MainPage controller and add the following lines:
chtml = captcha.displayhtml(
    public_key = "YOUR-PUBLIC-KEY",
    use_ssl = False,
    error = None)

template_values = {
    ...
    'captchahtml': chtml
}
Replace YOUR-PUBLIC-KEY with your public key - if you don't, you'll get the message:
"Invalid public key. Make sure you copy and pasted it correctly."

Now switch to your HTML template and add the captchahtml output within your form tags:
<form>
...
{{ captchahtml }}
</form>

When you open the site in your browser, you'll see the reCAPTCHA iframe:

When the form gets submitted, we need to check whether the CAPTCHA input was correct. To do that, go to the post method of the Guestbook handler and add some code:

def post(self):
    challenge = self.request.get('recaptcha_challenge_field')
    response = self.request.get('recaptcha_response_field')
    remoteip = environ['REMOTE_ADDR']

    cResponse = captcha.submit(
        challenge,
        response,
        "YOUR-PRIVATE-KEY",
        remoteip)

    if cResponse.is_valid:
        # response was valid
        # other stuff goes here
        pass
    else:
        error = cResponse.error_code
    ...

Replace YOUR-PRIVATE-KEY with your previously generated private key.

The user inputs to the reCAPTCHA iframe will be validated together with the remote IP (your visitor's IP) and the challenge. We get a response from the reCAPTCHA API server and a RecaptchaResponse object will hold the answer.

The RecaptchaResponse object has two properties:

  • is_valid is set to True if the test was successful (otherwise it will be False)
  • error_code will hold an API error code if there was a problem.

Note: In the Getting Started Guide for GAE, the form gets submitted to the Guestbook controller. Normally you would submit to the MainPage controller and pass the error code from the RecaptchaResponse object (if there is one) to the displayhtml method of the captcha class:

chtml = captcha.displayhtml(
    public_key = "YOUR-PUBLIC-KEY",
    use_ssl = False,
    error = cResponse.error_code)

That will display a human-readable error message to the user:

and lets your visitor redo the CAPTCHA without losing the values previously entered in the other form fields.

Keep in mind: for every submitted CAPTCHA, a request to the reCAPTCHA server is made. This request is synchronous, so the response to your visitor is delayed by the time it takes to fetch the answer. If the reCAPTCHA server cannot be reached, the error code recaptcha-not-reachable is returned.
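The decision logic around the verification call can be tried out without the network or the App Engine SDK. The following is only a sketch: FakeRecaptchaResponse, check_captcha and the two fake_submit functions are hypothetical names that are not part of captcha.py, and returning incorrect-captcha-sol for empty input is my assumption about a sensible error code. The real call is still captcha.submit as shown above.

```python
class FakeRecaptchaResponse(object):
    """Hypothetical stand-in with the two properties described above."""
    def __init__(self, is_valid, error_code=None):
        self.is_valid = is_valid
        self.error_code = error_code


# Assumption: error code used when the visitor submitted nothing at all.
EMPTY_INPUT_ERROR = "incorrect-captcha-sol"


def check_captcha(submit, challenge, response, private_key, remoteip):
    """Return (ok, error_code) for one form submission.

    `submit` is any callable taking the same arguments, in the same
    order, as the captcha.submit call shown earlier.
    """
    if not challenge or not response:
        # Nothing to verify, so skip the synchronous round trip entirely.
        return False, EMPTY_INPUT_ERROR
    result = submit(challenge, response, private_key, remoteip)
    if result.is_valid:
        return True, None
    return False, result.error_code


def fake_submit_ok(challenge, response, private_key, remoteip):
    # Simulates a correct solution.
    return FakeRecaptchaResponse(True)


def fake_submit_unreachable(challenge, response, private_key, remoteip):
    # Simulates the case where the reCAPTCHA server cannot be reached.
    return FakeRecaptchaResponse(False, "recaptcha-not-reachable")


print(check_captcha(fake_submit_ok, "c", "r", "key", "127.0.0.1"))
print(check_captcha(fake_submit_unreachable, "c", "r", "key", "127.0.0.1"))
```

In a handler, the returned error code would be passed on to displayhtml so the visitor sees the message next to a fresh CAPTCHA.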

Monday, 21 April 2008

Using custom django template helpers with Google App Engine

This is dedicated to all you django lovers out there. If you wondered how to use custom filter functions with your Google App Engine applications, here you go:

Create a new file (I created mine in common/templatefilters.py) to hold your template helper functions.

In this file, we need to register our custom filters, so:
# import the webapp module
from google.appengine.ext import webapp
# get registry, we need it to register our filter later.
register = webapp.template.create_template_register()
Now that we have the registry, we can define our custom function:
def truncate(value, maxsize, stopper = '...'):
    """ truncates a string to a given maximum
    size and appends the stopper if needed """
    stoplen = len(stopper)
    if len(value) > maxsize and maxsize > stoplen:
        return value[:(maxsize-stoplen)] + stopper
    else:
        return value[:maxsize]
Then, we need to register our filter as follows:
register.filter(truncate)
Now go to your bootstrap file (the one with your main function - in my case base.py - see image above) and add the library (if you used a different module than common.templatefilters, you need to specify it here):
webapp.template.register_template_library(
    'common.templatefilters')
After adding this line of code, your template functions are registered with django and can be used within your HTML templates:
{{ somevar|truncate:20 }}
This will truncate the variable somevar to 20 characters and add "..." to the end, if needed.
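To see what the filter does at the edges, the function can be called directly, outside of any template. This sketch repeats the truncate function from above so that it runs on its own; the sample strings are made up for illustration.

```python
def truncate(value, maxsize, stopper='...'):
    """Truncate a string to maxsize characters, appending the stopper if needed."""
    stoplen = len(stopper)
    if len(value) > maxsize and maxsize > stoplen:
        # Reserve room for the stopper so the result is exactly maxsize long.
        return value[:(maxsize - stoplen)] + stopper
    else:
        # Either the value already fits, or there is no room for the stopper.
        return value[:maxsize]


print(truncate("Hello, App Engine world!", 20))  # Hello, App Engine...
print(truncate("short", 20))                     # short
print(truncate("abcdef", 3))                     # abc
```

Note the last case: when maxsize is not larger than the stopper, the string is simply cut off without a stopper.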

Wednesday, 16 April 2008

ER-Modeling with Google App Engine (updated)

3rd updated version with a lot of input from the GAE group. Thanks to everyone!

ER-Modeling with Google App Engine is somewhat different to "normal" modeling for a relational database.

Here is a small tutorial on how to create the well-known relationship models One-to-One (1:1), One-to-Many (1:n) and Many-to-Many (m:n), plus a special one (cascading relations), with GAE:

We need the following Model classes for our example. If you want to run it, don't forget to import the db module for the Model classes and the sys module, which we need for printing to stdout:

import sys # import sys module for printing to stdout
from google.appengine.ext import db # datastore Model and Property classes

class Car(db.Model):
    brand = db.StringProperty(required=True)
    wheels = db.ListProperty(db.Key)

class Human(db.Model):
    name = db.StringProperty(required=True)
    drives = db.ReferenceProperty(reference_class=Car)
    spouse = db.SelfReferenceProperty()
    owns = db.ListProperty(db.Key)

class Wheel(db.Model):
    isBroken = db.BooleanProperty(default=False)
    position = db.StringProperty(choices=set(["left_front",
                                              "left_back",
                                              "right_front",
                                              "right_back"]))

One-to-One (1:1)

A simple relationship between two entities.

Let's say a human called Jack drives one car.
We can model this relationship as follows:

# one-to-one
jack = Human(name="Jack")
mercedes = Car(brand="Mercedes")
jack.drives = mercedes.put()
jack.put()
print >> sys.stdout, "Jack drives a "+jack.drives.brand

# Jack drives a Mercedes

As you can see, we create a human called "Jack" and a car "Mercedes". After that we assign the car to the drives-property of Jack and save Jack.

Hint (Thanks to Miguel for pointing this out): be careful with 1:1 relationships done with ReferenceProperty-properties - you could easily write something like this:

jack = Human(name="Jack")
mike = Human(name="Mike")
mercedes = Car(brand="Mercedes")
mercedesid = mercedes.put()

jack.drives = mercedesid
jack.put()

mike.drives = mercedesid
mike.put()
which is no longer a 1:1 relation. So if you really need to make sure that an entity is only referenced once, you have to enforce this in your code (by searching your existing entities for the reference and throwing an exception if one is found).
Even a cascading relationship with a parent entity (see later in this article) does not ensure that there is only one car per human, so it has the same difficulty, just the other way round.
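The idea of "check first, then throw an exception" can be sketched in plain Python, without the datastore. Everything here is hypothetical: the dictionary drivers_by_car stands in for a datastore lookup, and CarAlreadyDrivenError and assign_car are names made up for this illustration. In a real application the lookup would be a query on the Human kind, and a check like this is not atomic across concurrent requests.

```python
class CarAlreadyDrivenError(Exception):
    """Raised when a car is assigned to a second driver (hypothetical)."""


def assign_car(drivers_by_car, human_name, car_id):
    """Keep a strict 1:1 mapping between cars and drivers.

    drivers_by_car maps a car id to the name of its only driver.
    """
    current_driver = drivers_by_car.get(car_id)
    if current_driver is not None and current_driver != human_name:
        raise CarAlreadyDrivenError(
            "%s is already driven by %s" % (car_id, current_driver))
    # Release any other car this human was driving, so both sides stay unique.
    for other_car, driver in list(drivers_by_car.items()):
        if driver == human_name and other_car != car_id:
            del drivers_by_car[other_car]
    drivers_by_car[car_id] = human_name


drivers = {}
assign_car(drivers, "Jack", "mercedes")
try:
    assign_car(drivers, "Mike", "mercedes")
except CarAlreadyDrivenError as e:
    print("refused: %s" % e)
```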

Special: self references

A special sort of reference is the self-reference. That means a Model references an entity of the same Model class (e.g. a Human references a Human):
# one-to-one self
bob = Human(name="Bob")
jane = Human(name="Jane")

bob.spouse = jane.put()
bob.put()
b_spouse = Human.get(bob.spouse.key())
print >> sys.stdout, "Bob's spouse is "+b_spouse.name

# Bob's spouse is Jane
We created two humans (Bob and Jane) and set Bob's spouse-property to reference Jane.
Now we can easily find out who Bob's spouse is.

Special: mutual references

But if Bob is married to Jane, Jane is also married to Bob, isn't she?
So let's do this in a semantically correct way:
# one-to-one self mutual
bob = Human(name="Bob")
jane = Human(name="Jane")

bob.spouse = jane.put()
jane.spouse = bob.put()
jane.put()

j_spouse = Human.get(jane.spouse.key())
print >> sys.stdout, "Jane's spouse is "+j_spouse.name
b_spouse = Human.get(bob.spouse.key())
print >> sys.stdout, "Bob's spouse is "+b_spouse.name

# Jane's spouse is Bob
# Bob's spouse is Jane

Be careful to check whether reflexive references are semantically allowed or not - in our case it wouldn't be valid if Bob's spouse were Bob himself.

Also, in a monogamous society, it wouldn't be valid if Bob were the spouse of more than one other Human entity (!)
If you don't want this behavior, you need to prevent it by throwing exceptions. You could use the validator parameter that every Property class accepts.
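Both rules can be illustrated with a small helper in plain Python. This is a sketch under assumptions: Person and marry are hypothetical names and no datastore is involved. As far as I know, a property validator only receives the value being assigned, so a check that needs both entities is written here as a separate function.

```python
class Person(object):
    """Minimal stand-in for the Human model (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.spouse = None


def marry(a, b):
    """Set a mutual spouse reference, rejecting invalid combinations."""
    if a is b:
        # Reflexive reference: nobody can be their own spouse.
        raise ValueError("%s cannot marry themselves" % a.name)
    for person in (a, b):
        if person.spouse is not None:
            # Monogamy: each person may be referenced by one spouse only.
            raise ValueError("%s is already married to %s"
                             % (person.name, person.spouse.name))
    a.spouse = b
    b.spouse = a


bob = Person("Bob")
jane = Person("Jane")
marry(bob, jane)
print("Bob's spouse is " + bob.spouse.name)
print("Jane's spouse is " + jane.spouse.name)
```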

One-to-Many (1:n)

Sometimes we need to reference more than one entity from another.
When modeling 1:n relationships, a special feature of the Model class comes in handy: parent models (or ancestors).
Let's think of a car having four wheels. Each wheel belongs to exactly one car, so we define the car as parent for every wheel we create:

# one-to-many using parent
bmw = Car(brand="BMW")
bmw.put()

lf = Wheel(parent=bmw,position="left_front")
lf.put()

lb = Wheel(parent=bmw,position="left_back")
lb.put()

rf = Wheel(parent=bmw,position="right_front")
rf.put()

# uh, snap, the 4th wheel is broken!
rb = Wheel(parent=bmw,position="right_back",isBroken=True)
rb.put()

# from car to wheels
bmwWheels = Wheel.all().ancestor(bmw)
print >> sys.stdout, "The BMW has the wheels: "
for wheel in bmwWheels:
    print >> sys.stdout, "- "+wheel.position

# The BMW has the wheels:
# - left_front
# - right_back
# - right_front
# - left_back

# from wheel to car
brokenWheels = Wheel.gql("WHERE isBroken = :broken",
                         broken=True)
print >> sys.stdout, "The following cars are broken: "
for wheel in brokenWheels:
    print >> sys.stdout, "- "+wheel.parent().brand

# The following cars are broken:
# - BMW

As we can see, it is easy to get the wheels for a car, and it is also possible to get the car a certain wheel belongs to. An entity can only have one parent, so we have made sure that a wheel is not used by more than one car at the same time. You could even add some spare wheels to a car (there is no way to limit how many children a parent entity can have).

The other way round

It is also possible to do this the other way round. Let's say we have an additional Model like this:

class OwnedCar(db.Model):
    brand = db.StringProperty(required=True)
    owner = db.ReferenceProperty(Human, required=True)

Then we could add cars to a person as follows:

paul = Human(name="Paul")
paul.put()

pauls_bmw = OwnedCar(brand="BMW", owner=paul)
pauls_bmw.put()

pauls_mercedes = OwnedCar(brand="Mercedes", owner=paul)
pauls_mercedes.put()

pauls_cars = paul.ownedcar_set
print >> sys.stdout, "Paul's cars: "
for car in pauls_cars:
    print >> sys.stdout, "- "+car.brand

# Paul's cars:
# - BMW
# - Mercedes

This ensures, for example, that one car is only owned by one human at a time.

Also notice the part on how to get Paul's cars. You don't even need to create a GQL query - the property modelname_set holds the references. You can find more about this in the docs.

Thanks to Aprigio for his input!

Special: One-to-Many using a list

You can also create 1:n relationships using a list:

# one-to-many using list
dodge = Car(brand="Dodge")
w1 = Wheel(position="left_front")
w2 = Wheel(position="left_back")
w3 = Wheel(position="right_front")
w4 = Wheel(position="right_back")
dodge.wheels = [w1.put(),w2.put(),w3.put(),w4.put()]
dodge.put()

dodgeWheels = Wheel.get(dodge.wheels)
print >> sys.stdout, "The Dodge has the wheels: "
for wheel in dodgeWheels:
    print >> sys.stdout, "-"+wheel.position

# The Dodge has the wheels:
# -left_front
# -left_back
# -right_front
# -right_back

But be careful: this model does not ensure that a wheel belongs to exactly one car! You could reference a wheel from more than one car, which would be semantically wrong in our example. So this model is better suited to m:n relationships!

Many-to-Many (m:n)

For creating m:n relationships, use the following model:

# many-to-many using list+db.Key
jack = Human(name="Jack")
bob = Human(name="Bob")

vw = Car(brand="VW")
chevy = Car(brand="Chevy")

carpool = [vw.put(),chevy.put()]

jack.owns = carpool
jack.put()

bob.owns = carpool
bob.put()

chrysler = Car(brand="Chrysler")
jack.owns.append(chrysler.put())
jack.put()

jackOwns = Car.get(jack.owns)
print >> sys.stdout, "Jack owns: "
for car in jackOwns:
    print >> sys.stdout, "- "+car.brand

# Jack owns:
# - VW
# - Chevy
# - Chrysler

whoOwnsTheChevy = Human.gql("WHERE owns = :car",car=chevy)
print >> sys.stdout, "These humans own the Chevy: "
for who in whoOwnsTheChevy:
    print >> sys.stdout, "- "+who.name

# These humans own the Chevy:
# - Jack
# - Bob

First we create our two guys Jack and Bob. Then they decide on sharing cars. Their carpool consists of a VW and a Chevy. After that, Jack decides on buying an additional car for himself (the Chrysler).

When adding to a list like in this example, there is no check if the value already is present within the list - so it would be possible to reference the same entity multiple times. If you want to prevent this, you need to run a function on the list, which makes sure there are only unique entities:

def unique(lst):
    d = {}
    for item in lst:
        d[item] = 1
    return d.keys()

jeep = Car(brand="Jeep")
jack.owns = unique(jack.owns + [jeep.put()])
jack.put()
You could also check the content of the list before adding an entity to see whether it is already referenced.
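One thing to keep in mind with the dictionary-based unique function is that, on the Python version used here, the order of the returned keys is not guaranteed to match the order of the list. If the order of the references matters, a variant like the following keeps the first occurrence of each item in place. This is a sketch; unique_ordered is a hypothetical name and plain strings stand in for the db.Key values.

```python
def unique_ordered(lst):
    """Return the items of lst without duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Strings stand in for the keys that Car.put() would return.
owns = ["vw", "chevy", "chrysler"]
owns = unique_ordered(owns + ["chevy", "jeep"])
print(owns)  # ['vw', 'chevy', 'chrysler', 'jeep']
```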

Most flexible: using mapping entities

It might be a better idea to implement a separate entity to represent the relationship between a human and the cars they own.
The reason for this is that you can add more fields to it, which may be beneficial later in a query. Let's say we add a bought field that contains the date the car was bought. Then one could get the cars owned by Jack that he bought after a certain date.

class CarOwner(db.Model):
    car = db.ReferenceProperty(Car, required=True)
    owner = db.ReferenceProperty(Human, required=True)
    bought = db.DateProperty(auto_now_add=True)

    @staticmethod
    def get_owner_cars(human, bought):
        """Returns the cars that the given human
        bought on or after a specified date"""
        if not human: return []
        query = db.Query(CarOwner)
        query.filter('owner =', human)
        query.filter('bought >=', bought)
        return [entry.car for entry in query]

We'd also like to add a method to the Car model, so that it looks like this:

class Car(db.Model):
    brand = db.StringProperty(required=True)
    wheels = db.ListProperty(db.Key)

    def human_owns(self, human):
        """Returns the CarOwner entry if the given human owns this car, else None."""
        if not human: return False
        query = db.Query(CarOwner)
        query.filter('car =', self)
        query.filter('owner =', human)
        return query.get()
With the Models defined like that, we can go on using them:
max = Human(name="Max")
max.put()
saab = Car(brand="Saab")
saab.put()

ownership = CarOwner(car=saab,owner=max)
ownership.put()

import datetime
delta = datetime.timedelta(days=-1)
yesterday = (datetime.datetime.utcnow() + delta)

maxs_cars = CarOwner.get_owner_cars(max,yesterday)
print >> sys.stdout, "Max's cars since yesterday: "
for car in maxs_cars:
    print >> sys.stdout, "- "+car.brand

# Max's cars since yesterday:
# - Saab

max_owns_saab = bool(saab.human_owns(max))
print >> sys.stdout, "Max owns the Saab: "+str(max_owns_saab)

# Max owns the Saab: True

As you can see, the additional property on the mapping entity comes in handy when selecting the cars Max bought since yesterday.
The method on the Car model is also a neat way to bundle functionality related to a model within it.

Note: you can also use this sort of referencing for 1:1 and 1:n relationships. You just need to make sure that either one referenced entity is unique (for 1:n), or that both are (for 1:1).

Thanks to Brian, who brought this up!

Cascading relations

With the parent property, cascading relationships are possible. This is very useful if you have a folder-like structure (e.g. categories and products of an online shop). Let's have a look at how to do this:
import sys

class Category(db.Model):
    name = db.StringProperty(required=True)

class Product(db.Model):
    name = db.StringProperty(required=True)
    price = db.FloatProperty()
    categories = db.ListProperty(db.Key)

root = Category(name="Products")
root.put()
tech = Category(parent=root,name="Tech stuff").put()
books = Category(parent=root,name="Books")
books.put()
fantasy = Category(parent=books,name="Fantasy")
fantasy.put()
scifi = Category(parent=books,name="Science Fiction")
scifi.put()
scomics = Category(parent=scifi,name="SciFi comics")
scomics.put()

somebook = Product(name="Some book")
somebook.categories.append(books.key())
somebook.price = 9.99
somebook.put()

lotr = Product(name="Lord Of The Rings")
lotr.categories.append(fantasy.key())
lotr.price = 29.99
lotr.put()

allFantasyBooks = Product.gql("WHERE categories = :cat",
                              cat=fantasy)
print >> sys.stdout, "All fantasy books: "
for book in allFantasyBooks:
    print >> sys.stdout, "- "+book.name



path = []
p = scomics
while p.parent():
    path.append(p.name)
    p = p.parent()

path.reverse()
print >> sys.stdout, " > ".join(path)

# Books > Science Fiction > SciFi comics

Now getting all books in a category is pretty easy, huh? And also generating a breadcrumb-like "where am I" is simple by iterating through all parent entities for a given entity (here it is the science fiction comics Category).
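The breadcrumb loop does not depend on the datastore at all, so it can be tried out with a tiny stand-in class. In this sketch, Node and breadcrumb are hypothetical names. Node only mimics the two things the loop needs from a Category: a name attribute and a parent() method that returns None for the root.

```python
class Node(object):
    """Stand-in for a Category entity (hypothetical, no datastore involved)."""
    def __init__(self, name, parent=None):
        self.name = name
        self._parent = parent

    def parent(self):
        return self._parent


def breadcrumb(node, separator=" > "):
    """Walk up the parent chain and join the names, leaving out the root."""
    path = []
    while node.parent() is not None:
        path.append(node.name)
        node = node.parent()
    path.reverse()
    return separator.join(path)


root = Node("Products")
books = Node("Books", parent=root)
scifi = Node("Science Fiction", parent=books)
scomics = Node("SciFi comics", parent=scifi)

print(breadcrumb(scomics))  # Books > Science Fiction > SciFi comics
```

With real entities, each parent() call may have to fetch the parent entity, so deep hierarchies make this loop more expensive.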

Attention: while parent relationships are fine for the use case here, they are not very efficient when a lot of children are associated with one parent, because this results in a very large entity group.
The more entity groups your application has - that is, the more root entities there are - the more efficiently the datastore can distribute the entity groups across datastore nodes.
So for efficiency you should avoid cases where one parent has a lot of children.
According to the docs:

"A good rule of thumb for entity groups is that they should be about the size of a
single user's worth of data or smaller."

Thanks to Ben, Michael and Brian for pointing this out.

Thursday, 10 April 2008

Google App Engine & eclipse (PyDev)

updated (Link to youtube video) thanks to mano marks.
updated (Python debugging) thanks to mkielar.

updated (CC-by license).

Update: If you like following a video more than reading some text, Mano Marks from Google created a screencast about how to get started with App Engine.

Ever since Google published App Engine, I have been very interested in it. Too bad I didn't get an App Engine account, but at least I am able to test it locally...
As a fan of eclipse and being new to Python development, I searched for an eclipse Python extension and finally found the PyDev SourceForge project.
I know this is basic stuff, but it took me a few minutes to find out how to get code completion working with Google App Engine.
So here is a small HowTo for getting App Engine up and running with eclipse and PyDev:

  1. Get Python and install it.
  2. Get eclipse and unzip it.
  3. Get the Google App Engine SDK and install it.
  4. Open eclipse and go to Help -> Software Updates -> Find and Install.
    • Choose "Search for new features to install" and click on "Next"
    • Click on "New remote site"
    • Select the PyDev update site you just added
    • Click on "Finish"
    • After the update site has been searched, choose to install the PyDev plugin (see image below)
    • Click on "Next"
    • Accept the agreement
    • Click on "Finish"
    • Click on "Install All"
    • Choose "Yes" to restart eclipse
  5. After eclipse has been started
    • Choose Window -> Preferences to bring up the preferences dialog
    • Change to the "PyDev" -> "Interpreter - Python" section to configure Python
    • Click on "New" to the right of the "Python interpreters" field to add an interpreter
    • Search for the python.exe executable (normally in C:\Python25)
    • After you have selected the executable, PyDev will search for libraries, and the following screen comes up:
    • Normally, the preselection done by PyDev is fine, so click "OK" to accept the system pythonpath entries.
    • In the Preferences window, also click "OK" to confirm your changes.
  6. Now we are ready to create our first Python project.
    • Do a right-click in the Package Explorer and choose New -> Other.
    • In the upcoming dialog select "Pydev project" from the Pydev folder.
    • Create a new helloworld project with (important!) Python 2.5 (see image below)
    • After clicking "Finish", eclipse should switch to the Pydev view.
    • Do a right-click on your new project and choose "Properties"
    • In the properties dialog choose the "PyDev - PYTHONPATH" section to add the App Engine libraries (we need this for proper code assist)
    • Click on "Add source folder" to add the following folders (see image) from your google appengine folder:

    • Click "OK" to confirm your changes.
  7. For a quick start, you can work through the Getting Started tutorial.
    Use the "src" folder as base folder for your project (see image to the right)
  8. The last thing we need to do now is to add a user-defined launcher for our Google App Engine development server.
    • Go to "Run" -> "Open Run Dialog"
    • Choose "Python Run" and add a new launch configuration (document icon with the plus sign in the top left of the dialog)

    • Name your run configuration
    • As project, choose your Python GAE project.
    • As main module, enter the location of the "dev_appserver.py" script.
    • Change to the "Arguments" tab and enter "${project_loc}/src" as first argument.

      After this argument, you may add any of the additional arguments listed on the Dev Webserver documentation page (here, for example, we changed the port GAE listens on to 9999).
    • Click on "Apply" to save your changes
    • Click on "Run" to run your project
  9. From now on you can run your project by selecting your configuration from the "Run" dropdown and access your app on http://localhost:9999/.
  10. If an error occurs, you can see it in the console view of eclipse and click on it to jump to the error location within your scripts.
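What the launch configuration effectively does is start dev_appserver.py with the project's src folder and a port option. As a rough sketch, the same command line can be put together in Python. The function name dev_appserver_command and both directory paths are hypothetical, and the argument order simply follows the run configuration described above.

```python
import os


def dev_appserver_command(sdk_dir, project_dir, port=9999):
    """Build the argument list for starting the development server.

    sdk_dir is the folder that contains dev_appserver.py, project_dir is
    the eclipse project folder (the application lives in its src folder).
    """
    script = os.path.join(sdk_dir, "dev_appserver.py")
    app_dir = os.path.join(project_dir, "src")
    # Application folder first, then additional arguments such as the port.
    return ["python", script, app_dir, "--port=%d" % port]


# Both paths are made up for illustration.
cmd = dev_appserver_command("/opt/google_appengine", "/home/me/helloworld")
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.call to start the server from a script instead of from eclipse.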




Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Thursday, 3 April 2008

web 2.0 SVG badge

While looking for a web 2.0 badge, I came across the following Photoshop tutorial.
Since I don't own a current version of Photoshop, and a Google search for a web 2.0 SVG badge didn't turn up anything really usable, I quickly grabbed inkscape and rebuilt the badge from the tutorial by eye.
You can inspect the result on the left and download it here.


Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 2.0 Germany License.