So now that we have functions for scraping data from sites, let's imagine we had to do it for a whole bunch of site. In the case of webpagetest.org, simply putting this in any array won't work. Whenever you go to webpagetest.org, you're entering a URL that, when submitted, changes the path of the webpage. So you start on webpagetest.org, but entering in your site changes the URL from "http://www.webpagetest.org" to "http://www.webpagetest.org/result." This means in order to do a simple automation of our test, we need to have something do the web surfing for us. Before that, however, I have to make a confession: if you use this test to do an large amount of testing, as I did, you're not being a good web citizen. Ensure that you only do this for a small batch of URL's (like 100 at the MOST). Ask for an API. Be wise, unlike myself.
Anyways, there is a tool to do our aforementioned task. Its called Selenium-Webdriver and it made me giggle the first time I used it! There are some awesome and surprisingly intuitive methods with this gem. First to start off this program, you'll have to require the gem. Now let's say we wanted to create a real simple demo program that opened up the browser, navigated to google, and search google for a something (I'm picking Star Trek Beyond Trailers). That's relatively simple and we can draw on some skills we learned from Nokogiri.
The only way a computer is going to know what a google search bar is, is through code. For us, that means HTML and CSS id's. To find these, we right click and click inspect elements. Luckily most element inspectors make finding elements in code easy. For the google search bar the HTML ID is "gs_lc0." Now for the code:
Firstly, we call on Selenium-Webdriver and this takes an argument for which browser you'll be using. So far, I've only tried this on Firefox and Chrome. Using Firefox is extremely easy, so I'd recommend that. Next you're going to want to go to google. The ".navigate.to" method does just that, it navigates to a page. Now, we want to do something to the page, namely enter a value into the google search bar and submit it to search. As mentioned earlier, the google search bar id is "gs_lc0." The find_element method finds the element with your two given arguments. Using ID is the most specific as each ID is unique to the item, "gs_lc0" will only be used for the search bar.
So now your computer has identified the search bar, what do we do? The same things we do as human beings, we just don't really think about it! We click on the search bar, type some stuff, and click or press enter. We just have to tell Selenium_Webdriver to do that now. For this, we use ".click" which prompts the search bar with a click. Then we type something out with the "send_keys" method which takes a string value argument. Finally we wrap it up with ."submit," which is our enter key, making Google search for us Star Trek Beyond Trailers. Which surely to be a great movie!
No comments:
Post a Comment