Before I forget, I thought I should quickly jot down what I learnt in the process, with the hope that this will be useful to others (or, perhaps, my future self).
Selenium Roses by bcostin http://www.flickr.com/photos/bcostin/155246852/ |
The Selenium website succinctly describes the software: "Selenium automates browsers". Primarily known as a web site testing tool, the seleniumhq site points out that "[b]oring web-based administration tasks can (and should!) also be automated as well".
It turns out that Selenium is, in fact, a suite of tools. There's Selenium IDE, a Firefox plugin that is ideal for quickly prototyping an automation script. And there are various server side components with the core being the WebDriver API. (The WebDriver is also known as "Selenium 2", as it is a merger of the original Selenium with a project started by a tester at Google. See the Selenium project history for more). The WebDriver lets you control various browsers - Internet Explorer, Chrome, Opera, HtmlUnit. It even has ways to simulate iPhone and Android devices. But the browser best-supported by Selenium is Firefox. I had to download a fresh copy, since I've almost entirely switched to Chrome these days. But it was worth it.
Firefox magnet (wallpaper) by flod http://www.flickr.com/photos/flod/2568092124/ |
How I Did It
I wound up basing my script very much on the example in the Getting Started section. Along the way, I developed a few patterns that I'd like to pass along, in the hopes that they will help.
As I mentioned earlier, the web application I was trying to automate turned out to be quite unwebfriendly. What I mean is that it used all kinds of complicated tricks to ensure that, for example, the URL doesn't change as you navigate to different pages within the application. Similarly, rather than using anchor tags (<a href>) for links, it used Javascript "onclick" scripts to dynamically construct them. The good news is that Python-connected-to-Selenium-powering-Firefox is perfect to automate this kind of application, as it fully supports Javascript, frames, pop up windows and the like.
Waiting by amchu http://www.flickr.com/photos/amchu/5261511319/ |
One thing I quickly figured out was that my script needed to wait for the complicated, multi-part pages to finish loading before going onto the next step. The documentation explains how to use either explicit or implicit waiting in Selenium. At first, I followed the lead of the example and looked for a particular web page title. Or I thought I could detect changes using the URL. Except that my web application doesn't change title or URL as the pages change!
However, I hit upon a solution: in each step, my script searches for a particular element by name or id. So, I realized that waiting for that element to be appear was the most effective strategy. This lead me to code like this:
try:
# we have to wait for the page to load
WebDriverWait(driver, 10).until(lambda driver : driver.find_element_by_name("login"))
except:
driver.quit()
Use the Source
I also figured out a variation on the try/wait pattern to help me figure out the next step. Often, I wrote a WebDriverWait function that would time out, without succeeding. Eventually, I realized that simply quitting the driver wasn't the most useful way to debug things. Instead, I would use the page_source property to help me see what was really loaded into the browser at that point:
try:
WebDriverWait(driver, 10).until(lambda driver : driver.find_element_by_name("mainfs"))
except:
print driver.page_source
I also found that it was useful to try to figure out how to navigate the web application using the Selenium-IDE plugin. I would record my actions in the IDE, then review the script it generated, to get clues as to how to write my WebDriver script.
I say "clues", because the IDE generates Selenium 1.0 commands, not the 2.0 syntax I needed to use. Generally, it gave me enough information that, with some searching of the API documentation, I could figure out the right equivalents. However, in certain cases, there are capabilities in Selenium 1.0 that have no direct equivalent in the 2.0.
Grandpa by conowithonen http://www.flickr.com/photos/cmogle/2907198746/ |
One example: in 1.0, there's a way to do relative navigation between frames. It turned out I needed this, since at one point in the web application I was trying to automate, it leaves the focus inside a subframe. Experimenting with the IDE, I saw that it wriggled out of this problem by executing
selectFrame(relative=up)
The problem is that there is no equivalent relative move between frames in Selenium 2. Eventually, I figured out that I needed to switch the focus to the entire window (by making use of the "current_window_handle" property), which then let me select the particular subframe I was looking for
driver.switch_to_window(driver.current_window_handle)
driver.switch_to_frame("body")
Don't Give Up!
Once I figured out these patterns of working by experimenting with the IDE, and printing the source whenever I got stuck, I found that automating the web application with Selenium was fairly straightforward. It would be nice if the documentation was a bit fuller. And it would be wonderful if the IDE generated Selenium 2 commands. But I think that my small investment in figuring it all out was worthwhile. So, now I'm looking for more things to try with Selenium. For example, could I use it to try out RESTful testing of APIs or Linked Data? Are there other web-based chores that I could (even partially) automate?