Wednesday, May 26, 2010

Download all the MP3 links on a page, given its URL, with Ruby and Nokogiri

You'll need Ruby and the nokogiri gem installed (gem install nokogiri). Then run this script, passing one or more page URLs as arguments:


#!/usr/bin/env ruby

require 'nokogiri'
require 'open-uri'

# Print a usage message and exit with a non-zero status when no url is given
if ARGV.empty?
  puts "usage: #{File.basename(__FILE__)} 'url' ..."
  exit 1
end

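ARGV.each do |url|
  # URI.open comes from open-uri; on Ruby 3.0+ plain open() no longer fetches URLs
  doc = URI.open(url) {|io| Nokogiri.HTML(io.read) }

  mp3_links = doc.xpath("//a").select {|link| link['href'] =~ /\.mp3$/ }
  mp3_links.each do |link|
    # Resolve relative hrefs against the page url so they can be fetched
    href = URI.join(url, link['href']).to_s
    outname = File.basename(href)
    puts "Downloading: #{outname}"
    URI.open(href) do |io|
      # 'wb' writes the bytes unchanged (text mode can corrupt binary files on Windows)
      File.open(outname,'wb') {|out| out.write(io.read) }
    end
  end
end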
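
One thing the script does not handle is an href with a query string, such as song.mp3?dl=1. The pattern only matches links that end in .mp3, so that link would be skipped, and File.basename would keep the ?dl=1 in the filename anyway. Here is a minimal sketch of one way to work out the download url and a clean local filename using only the standard library. The helper name mp3_target and the example.com address are made up for illustration and are not part of the script above.

```ruby
require 'uri'

# Hypothetical helper: returns [absolute_url, local_filename] for a link.
# page_url is the page the link was found on, href is the raw attribute value.
def mp3_target(page_url, href)
  # URI.join resolves relative and root-relative hrefs against the page url
  absolute = URI.join(page_url, href)
  # Using only the path component drops any query string from the filename
  [absolute.to_s, File.basename(absolute.path)]
end

p mp3_target('http://example.com/music/index.html', 'tracks/song.mp3?dl=1')
# prints ["http://example.com/music/tracks/song.mp3?dl=1", "song.mp3"]
```

Note that URI.join raises URI::InvalidURIError for hrefs that are not valid URIs (an unescaped space, for example), so a real script would probably want to rescue that and skip the link.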

1 comment:

EGP said...

Now that's some useful code!