WebDriver is a way of driving the browser, and it only does so for a handful of browsers. Why do we do it like this? WebDriver couples tightly to each browser using the technology that fits it best. It is also a developer-focused API, so it is extremely object oriented. If we look at Internet Explorer, the driver accesses the browser via the COM layer, using automation hooks that Microsoft have put there and maintain. The core code is written in C++. FirefoxDriver is a Firefox add-on that accesses items at the chrome layer (the browser's own UI layer, not Google Chrome). ChromeDriver is a Chrome extension that allows us to drive Chrome. The Android driver is an APK that allows us to drive the WebView, and the same goes for the iPhone. Since we are tightly bound to the browser, we don't have to use a server in the middle, so WebDriver scales up and down as we see fit. The other benefit is that on these browsers and devices we can send more native keystrokes. WebDriver also tries to limit what a test can do by only allowing it to interact with elements that a user could. For example, if you try to click on an element that has display:none in its style, WebDriver will throw an exception saying the element isn't visible.
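To make the "only what a user could do" rule concrete, here is a minimal sketch of the same check written on the test side. It assumes an object that behaves like a WebElement, using its `is_displayed()` and `click()` methods. The names `click_like_a_user`, `HiddenElementError` and `StubElement` are made up for this illustration and are not part of WebDriver.

```python
class HiddenElementError(Exception):
    """Raised by this sketch when a test tries to click a hidden element."""


def click_like_a_user(element):
    """Click `element` only if a real user could see it.

    `element` is expected to behave like a Selenium WebElement, which
    provides is_displayed() and click().
    """
    if not element.is_displayed():
        raise HiddenElementError("element is not visible, so a user could not click it")
    element.click()


class StubElement:
    """Hypothetical stand-in for a WebElement so the sketch runs without a browser."""

    def __init__(self, displayed):
        self.displayed = displayed
        self.clicks = 0

    def is_displayed(self):
        return self.displayed

    def click(self):
        self.clicks += 1
```

WebDriver performs this kind of check for you inside the driver, so the helper is only there to show the idea: the hidden element never receives the click.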
All of the drivers have a common interface that every language binding can speak to. The WebDriver project uses a wire protocol that lets you talk to a driver over HTTP in a REST style. So when we start any browser that we support, there is a web server inside it that we then speak to. This means that anyone who wants to add support for a new browser just needs to implement that API, and we can then use it from the RemoteWebDriver. It also simplifies the way the drivers communicate, which means that anyone looking at the code can help out where need be.
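As a rough sketch of what "speaking to the driver over HTTP" looks like, the helpers below build (but never send) a few requests in the shape used by the original JSON wire protocol. The server address is an assumption (the usual default for a standalone server), the helper names are made up for this example, and newer protocol versions use different payloads.

```python
import json
from urllib.request import Request

# Assumption: a WebDriver server is listening here. The text does not
# name an address; this is only a commonly used default.
SERVER = "http://localhost:4444/wd/hub"


def wire_request(method, path, payload=None):
    """Build (but do not send) one wire-protocol HTTP request."""
    body = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json; charset=utf-8"} if body else {}
    return Request(SERVER + path, data=body, headers=headers, method=method)


def new_session(browser_name):
    """Ask the server to start a browser."""
    capabilities = {"desiredCapabilities": {"browserName": browser_name}}
    return wire_request("POST", "/session", capabilities)


def find_element_by_id(session_id, element_id):
    """Ask the server to locate an element by its id attribute."""
    path = "/session/%s/element" % session_id
    return wire_request("POST", path, {"using": "id", "value": element_id})


def send_keys(session_id, element_ref, text):
    """Ask the server to type `text` into a previously found element."""
    path = "/session/%s/element/%s/value" % (session_id, element_ref)
    return wire_request("POST", path, {"value": list(text)})
```

Passing one of these `Request` objects to `urllib.request.urlopen` would send it to a running server. The language bindings do exactly this kind of work for you, which is why a new browser only has to answer these HTTP calls to be usable from the RemoteWebDriver.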
WebDriver is designed by developers for developers. In Selenium 1 you would get the Selenium object and then call whatever you needed on that one object. What you needed could be anything from clicking to typing, and it was always done against that single object. WebDriver, on the other hand, follows a number of good object-oriented design principles. We have a driver that starts the browser and gives us ways to find what we need on the page, and it returns another object representing that item in the DOM. We then use that object to do what we need. So if we take a simple form, we just tell the driver to find the textbox. We then tell the object representing the textbox that we want to send it some keystrokes, and there we have it.
Selenium 2 is a lot faster than Selenium 1. I have seen a speed-up of at least three times on a number of projects. Removing the need for a man-in-the-middle helps with the speed. Since we are more closely bound to the browser, we can take better advantage of how it does things. For example, if we know that the browser can search for elements quicker than a JavaScript library can, we let it do that. This means that if you supply a CSS selector and we know the browser can handle it natively, we let the browser return the element instead of going through Sizzle. The gain is even larger when your tests don't have to rely on the Selenium server as a middle man. All languages supported by the project also have access to HtmlUnit, a headless browser, by using the RemoteWebDriver.
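From the test's point of view, the CSS selector case looks like the sketch below. The helper `texts_matching` and the two stub classes are made up for this example. The only real API assumed is the driver's `find_elements(by, value)` method and the element's `text` attribute, with `"css selector"` being the locator string the Python bindings use for CSS lookups.

```python
# Same string as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def texts_matching(driver, selector):
    """Return the text of every element matching a CSS selector.

    The selector is handed to the driver as-is. Whether it is resolved
    natively by the browser or through a JavaScript library is decided
    inside the driver, not in the test.
    """
    elements = driver.find_elements(CSS_SELECTOR, selector)
    return [element.text for element in elements]


class StubLink:
    """Hypothetical stand-in for a WebElement."""

    def __init__(self, text):
        self.text = text


class StubPageDriver:
    """Hypothetical stand-in for a driver; records each lookup it receives."""

    def __init__(self, texts):
        self.texts = texts
        self.lookups = []

    def find_elements(self, by, value):
        self.lookups.append((by, value))
        return [StubLink(text) for text in self.texts]
```

The test code stays the same whichever lookup strategy the driver picks, so any speed gain from a native lookup comes without changes to the tests.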
Since Selenium 2 can bind to the browser without a server, while still letting us add one when we see fit, the new Selenium implementation can scale both up and down. One of the Selenium committers has been experimenting with getting 36 different sessions running on a server. Being able to control that many browsers without needing to manage a full-blown grid is a big win.
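One way to picture scaling up without a grid is to run several sessions from a single process. The following is only a sketch: `fetch_title`, `titles_in_parallel`, `max_sessions` and `StubBrowser` are made-up names, and the driver is assumed to offer `get()`, `title` and `quit()` as WebDriver objects do.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_title(make_driver, url):
    """Start one browser session, load `url`, and return the page title."""
    driver = make_driver()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # always release the browser, even if the load fails


def titles_in_parallel(make_driver, urls, max_sessions=4):
    """Run one session per URL, with at most `max_sessions` open at a time."""
    with ThreadPoolExecutor(max_workers=max_sessions) as pool:
        # pool.map returns results in the same order as `urls`.
        return list(pool.map(lambda url: fetch_title(make_driver, url), urls))


class StubBrowser:
    """Hypothetical stand-in for a driver so the sketch runs without a browser."""

    def __init__(self):
        self.title = ""
        self.closed = False

    def get(self, url):
        self.title = "title of " + url

    def quit(self):
        self.closed = True
```

With Selenium installed, `make_driver` could be something like `webdriver.Firefox`. How many real browsers one machine can sustain depends on the hardware, so `max_sessions` is something to tune rather than a recommendation.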
We saw this briefly earlier, but let us have a better look at it. I just said that we use the driver to find an object on the page. We then interact with that object directly.
**REPL
from selenium import webdriver
driver = webdriver.Chrome()
textbox = driver.find_element_by_id("id")
textbox.send_keys("let's type something")
**REPL
Having the code work this way is closer to how we think about code when we do normal OO development.
Moving from Selenium 1 to 2 is as simple as creating a WebDriver object, injecting it into Selenium, and then using the exact same Selenium API that we have grown to love. With a two-line change to your tests, you can suddenly be using the new Selenium 2 code. We have taken a lot of care in trying to make sure that we are 100% backwards compatible. This will hopefully give people a really easy upgrade path.
What about Selenium Grid? A number of companies have invested heavily in running their tests in parallel using Selenium Grid. What is going to happen to it with these changes? The open-source project has been fortunate to receive a donation from eBay. They have donated a new version of the Grid, which we are calling Grid 2. It can cope with both the Selenium 1 and Selenium 2 APIs, making it fully compatible with all that we need.
We all know that having working mobile sites matters more and more as people start using Android phones and tablets, or iPhones, iPods and even iPads, to view your site. Mobile versions of sites are becoming extremely important. Selenium 2 has good support for Android and iOS devices. The Selenium project has servers that can be installed onto these devices and that allow us to test web applications on them. Let us see an example of this.
It is starting to look like, in the future, the browsers themselves will give you access to Selenium.