Code Snippet: Ruby Image Scraper

Posted by Ryan Baxter Tue, 14 Aug 2007 03:46:00 GMT

I stumbled upon a screen scraping library for Ruby last week called scrAPI. It’s extremely flexible and can be seen in action on the co.mments blog post scraper. The scrAPI library can be installed by issuing the following command from your console:

gem install scrapi

Testing scrAPI was fairly easy once I figured out how to define a scraper. With that out of the way, I wrote a small script that saves the images found at a URL provided by the user. The scrAPI library could be used for good or evil; only you can decide.

#!/usr/bin/ruby

require 'fileutils'
require 'open-uri'
require 'pathname'
require 'rubygems'
require 'scrapi'

# Get the URL input.
puts 'Enter a URL:'
url = gets.chomp

# Get the HTML source.
html = nil
open(url) {|source| html = source.read()}

# Define the scraper.
scraper = Scraper.define do
  array :images
  process "img", :images => "@src"
  result :images
end

# Scrape the HTML for images.
images = scraper.scrape(html)

# Create a directory to save the images in. mkdir_p also creates any
# intermediate directories when the URL contains a path.
directory = url.sub(%r{\Ahttps?://}, '')
FileUtils.mkdir_p directory

images.each do |image_path|
  # Resolve image_path against the page URL. URI.join handles absolute
  # URLs, root-relative paths, and relative paths (URI is loaded by open-uri).
  image_url = URI.join(url, image_path).to_s

  # Write the image to disk.
  open(image_url) do |source|
    file_name = image_url.split('/').last
    open(directory + '/' + file_name, 'wb') {|file| file.write(source.read())}
  end
end

puts 'Finished...'
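
The trickiest step in the script is deciding whether an image path is absolute or relative. A minimal sketch of one way to handle it, assuming Ruby's standard URI library is acceptable here; the helper name resolve_image_url and the example.com addresses are hypothetical and used only for illustration:

```ruby
require 'uri'

# Hypothetical helper: resolve the src attribute of an <img> tag
# against the URL of the page it was found on.
def resolve_image_url(page_url, src)
  URI.join(page_url, src).to_s
end

page = 'http://example.com/gallery/index.html'

# Relative path: resolved against the page's directory.
puts resolve_image_url(page, 'thumbs/cat.png')
# → http://example.com/gallery/thumbs/cat.png

# Root-relative path: resolved against the host.
puts resolve_image_url(page, '/img/dog.png')
# → http://example.com/img/dog.png

# Absolute URL: returned unchanged.
puts resolve_image_url(page, 'http://cdn.example.org/bird.png')
# → http://cdn.example.org/bird.png
```

Letting URI.join do the resolution avoids string concatenation, which would produce a broken address whenever the page URL ends in a file name or the image path is already a full URL.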

I Want My IDE

Posted by Ryan Baxter Fri, 10 Aug 2007 02:32:00 GMT

The majority of my academic and professional programming career has been spent writing code using an integrated development environment (IDE). I’ve dabbled with Eclipse, Microsoft Visual Studio, Macrodobe Dreamweaver, and various Borland products. Rather than juggle multiple text editors and source control consoles, I find it easier to stay organized using an IDE on large projects. When editing config files or writing scripts I prefer a lightweight text editor. In Linux, vi or gedit is my choice. Notepad2 is at the top of my Windows list.

A few months ago I decided to learn Ruby and the Ruby on Rails framework. I began with the obligatory Hello World program and quickly progressed through a series of tutorials using vi and gedit to get the job done. Since then, I’ve begun some larger projects and am quickly finding myself losing focus and missing the benefits of an IDE. Consulting Google, I compiled a list of prospective IDEs to begin my evaluation. I’m willing to give each of them a fair chance at becoming my Rails development environment, but have a few questions before I begin. What, if any, IDEs have I missed? How long should I try each one?

Needs (in order of importance):

  1. Linux compatible
  2. Project Browsing
  3. SVN integration
  4. Syntax Highlighting
  5. Code Completion
  6. Active Community
  7. Unit Testing
  8. Debugging
  9. Auto-indent
  10. Plugin support
  11. Less than $100

The list:
*Each IDE/editor supports Project Browsing and Syntax Highlighting and is compatible with Linux.

  1. Aptana RadRails
    Pros
    • Good SVN integration.
    • The latest Beta has working Code Completion.
    • Built on Eclipse.
    • More Rails features than Eclipse + DLTK.
    • Many plugins inherited from Eclipse.
    • Free.
    Cons
    • Code Completion is broken in the current stable release.
    • Built on Eclipse.
  2. Eclipse + DLTK
    Pros
    • SVN integration.
    • DLTK has Code Completion.
    • Tried and true.
    • Vast library of plugins.
    • Active community.
    • Free.
    Cons
    • Eclipse is slow and consumes a lot of memory.
  3. FreeRIDE
    Pros
    • Auto-indenting.
    • Debugging.
    • Free.
    Cons
    • No SVN integration.
    • No Code Completion.
    • Performance could be an issue because it’s a native Ruby application.
  4. gedit + plugins
    Pros
    • Lightweight.
    • Plugins.
    • Free.
    Cons
    • No SVN integration.
    • No Code Completion.
  5. jEdit
    Pros
    • SVN integration.
    • Code Completion.
    • Plugins.
    • Free.
    Cons
    • Not user friendly.
  6. IntelliJ IDEA 6.0
    Pros
    • SVN integration.
    • Code Completion.
    • Debugging.
    • Unit Testing.
    • Plugins.
    • Much more…
    Cons
    • $249.
  7. Komodo IDE 4.1
    Pros
    • SVN integration.
    • Code Completion.
    • Debugging.
    • Built specifically for Ruby on Rails.
    • Much more…
    Cons
    • $295.
  8. Mondrian Ruby IDE
    Pros
    • Lightweight.
    • Free.
    Cons
    • No SVN integration.
    • No Code Completion.
    • Performance could be an issue because it’s a native Ruby application.
    • Spam in support forum.
  9. NetBeans Beta 6.0 Milestone 10+
    Pros
    • SVN integration.
    • Code Completion.
    • Debugging.
    • Plugins.
    • Free.
    • Much more…
    Cons
    • Beta.
  10. Ruby IDE from CodeGear
    Pros
    • CodeGear experience.
    Cons
    • Feature set not yet released.

I’ll be evaluating each of the IDEs/editors in turn and publishing my results as a series. Feel free to leave feedback and check back soon!

Expect the Unexpected: THIS IS WAL-MART!

Posted by Ryan Baxter Tue, 07 Aug 2007 00:57:00 GMT

I purchased the Two-Disc Special Edition DVD of 300 from Wal-Mart this evening. Attached to the cover was a sticker that read “DOWNLOAD THIS MOVIE” and in small print, “Must reside in U.S. Windows Media Compatible Only. Not compatible with iPods.” Visiting walmart.com/300 in Firefox yielded the following:



It’s 2007. Why?
