I've been working on a variety of data collection tools, and plan to keep at it. It's fun, I'm getting some things done, and it exercises the brain. One thing keeps coming up through all of these exercises:
I'm starting to realize that the hardest part about collecting data from the web is not grabbing the data (Nokogiri makes that pretty easy)... it's making the data fit into the 'holes' you want it to fit in. I'm not referring to int, varchar, etc., and not to 'an array of hashes' either; that's super easy. The ASCII/non-ASCII character sets are the large pain. You have the web/HTML, Ruby (or Python or Java or Perl or PHP or whatever), and MySQL. Each has different constraints on text, and each represents some characters differently. It's especially apparent when collecting foreign and domestic names, as one example -- the tildes, the umlauts, the accents, etc. It's really a set of devilish details that depends on the problem you're trying to solve. Yet another reason to master regular expressions, not to mention to grok the text representation of each system.
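To make the mismatch concrete, here is a minimal Ruby sketch of getting scraped bytes into one known encoding before they go anywhere near the database. The helper name `to_utf8` and the ISO-8859-1 fallback are my own assumptions for illustration; a real page may declare a different charset, and this needs a Ruby with built-in string encodings (1.9 or later).

```ruby
# Hypothetical helper: coerce scraped bytes to valid UTF-8.
# Assumption: if the bytes are not already valid UTF-8, they are ISO-8859-1.
def to_utf8(raw, fallback_encoding = "ISO-8859-1")
  as_utf8 = raw.dup.force_encoding("UTF-8")
  return as_utf8 if as_utf8.valid_encoding?

  # Reinterpret the same bytes as the fallback encoding, then transcode.
  raw.dup
     .force_encoding(fallback_encoding)
     .encode("UTF-8", invalid: :replace, undef: :replace)
end

# The same surname as Latin-1 bytes and as UTF-8 text.
latin1_bytes = "B\xC4CKSTR\xD6M".b
utf8_text    = "B\u00C4CKSTR\u00D6M"

same_name = to_utf8(latin1_bytes) == to_utf8(utf8_text)
```

The database side still has to agree: the table and connection character sets need to be UTF-8 as well, or the characters can get mangled on the way in.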
...I'm starting to believe I need to build an Adapter to bridge the gap between all of these character interfaces. In my Ruby classes, all inputs should funnel into a form the database expects and can represent properly. Web, Ruby, and database should understand each other the way I (a human) can look at names and interpret them. For example, scraping Joakim BÄCKSTRÖM should equal Joakim BÄCKSTRÖM whether the site has it as Joakim BackstrÖm or Joakim Backstrom or Joakim BACKSTROM or whatever. No matter which version I scrape on any site, they should all be equal, and should all lead back to PLAYERID = 623, for example, so all his data is collated and connected. On the output side, it should also appear as one 'blessed' name.
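One way this could look, as a rough sketch rather than a finished design: reduce every scraped spelling to a comparison key, then map that key to the player id and the blessed name. `NameKey` and `PlayerRegistry` are hypothetical names, and `String#unicode_normalize` assumes Ruby 2.2 or later.

```ruby
# Hypothetical: reduce a scraped name to a plain comparison key.
module NameKey
  COMBINING_MARKS = /\p{Mn}/

  def self.from(name)
    name.to_s
        .unicode_normalize(:nfd)     # split "Ä" into "A" plus a combining mark
        .gsub(COMBINING_MARKS, "")   # drop the marks, keep the base letters
        .downcase
        .gsub(/\s+/, " ")
        .strip
  end
end

# Hypothetical: one player id and one 'blessed' name per comparison key.
class PlayerRegistry
  def initialize
    @ids = {}
    @blessed = {}
  end

  def register(player_id, blessed_name)
    @ids[NameKey.from(blessed_name)] = player_id
    @blessed[player_id] = blessed_name
  end

  # Returns the player id for any known spelling, or nil.
  def lookup(scraped_name)
    @ids[NameKey.from(scraped_name)]
  end

  # Returns the blessed name for any known spelling, or nil.
  def blessed_name(scraped_name)
    @blessed[lookup(scraped_name)]
  end
end

registry = PlayerRegistry.new
registry.register(623, "Joakim BÄCKSTRÖM")
variants = ["Joakim BackstrÖm", "Joakim Backstrom", "Joakim BACKSTROM"]
player_ids = variants.map { |name| registry.lookup(name) }
```

This only covers characters that decompose into a base letter plus a mark; letters like "ø" or "ß" would need an explicit mapping table, and two different players whose names reduce to the same key would collide.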
Any thoughts on this issue? Any ideas of a different design pattern I could use besides Adapter?
A web developer, architect, & aspiring RESTafarian's thoughts on software, web tech, entrepreneurial endeavors and some creative ideas. Mark's current focus is on developing elastic & RESTful Ajax applications on the Cloud with the following technologies: OO and unobtrusive JavaScript using the Prototype JS library and jQuery; on the server side, OO code in Ruby and Rails; Amazon EC2, S3, and SimpleDB; MySQL. Currently learning a new language each year and grokking Unix.
Monday, December 29, 2008
Thursday, December 25, 2008
Merry Christmas, Irish Optimists!
Irish annihilate Hawaii
What a phenomenal game for the Irish yesterday in the Hawaii Bowl. Clausen was perfect: completely accurate, with 401 yards and 5 TD passes through 2.5 quarters. Armando Allen ran back a kick for a TD, and Golden had 2 bomb TD catches plus one punt return (called back for roughing, too bad).
The Irish looked dominant on both sides of the ball and on special teams. It was by far the best game they played all year, let alone their best game on the road, and it came against a Hawaii team that beat Fresno State and nearly beat the Big East champion, Cincinnati.
Enjoy this sequence of pics from Armando Allen's kickoff return... check out the block by Tate as he leads Armando through the hole in the wedge... awesome:
wedge develops and Golden starts through the hole...

Tuesday, December 23, 2008
Merb and Rails unite in Rails3!
I see this as great news for the Ruby and Rails and Merb communities... what are your thoughts?
Yehuda Katz's post is a great rundown:
http://weblog.rubyonrails.org/2008/12/23/merb-gets-merged-into-rails-3/comments/24239#comment-24239
http://yehudakatz.com/2008/12/23/rails-and-merb-merge/
http://rubyonrails.org/merb