Showing posts with label Ruby. Show all posts

Wednesday, January 28, 2009

Friday, January 16, 2009

'carl_spackler' about to get ORM-ified

...yup, it's time... no more mysql gem... Should have done this earlier, but no better time than the present to implement...

...this weekend...going to convert any [current 'carl_spackler'] database queries that are mysql-specific, into ActiveRecord calls. This way, someone is just one adapter change away from using their database, any db they want that ActiveRecord supports, with Spackler. ...but really, it will make things easy for me to write to my db using the ActiveRecord syntactical sugar. Laziness... a virtue!

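require "active_record"

# "sqlite3" is the ActiveRecord adapter name for SQLite 3 files; :database replaces the older :dbfile key
ActiveRecord::Base.establish_connection({
  :adapter  => "sqlite3",
  :database => "db/mygolfdb.sqlite"
})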

Sunday, January 11, 2009

normalizing up 3 part names -- initial stake in ground

...all tests passing... ...collecting all 2008 PGATour data, and more Euro data now...

There are ZERO orphans in the 2008 PGATour data right now. Have collected each and every player's data for 36 tournaments in 2008. Including any other 3 part names.

The Player class is not in its ultimate form, but it is there and it splits names appropriately... it still doesn't flatten special wacky characters, and I'm not using any Bayesian techniques yet, but it takes care of the 3 part names accurately: Jose Maria Olazabal, David Berganio Jr., Davis Love III, etc. Also had to regex out extras in things like "Davis Love III (PB)"... the (PB) indicating the course name.

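# remove a two-character parenthesized course code such as "(PB)", then trim the leftover space
re = /\(\w{2}\)/
processed = name.gsub(re, "").strip

...re-scraping about 75 tournaments for PGA and Euro Tour with new names in the next 15 mins... pushed the new code to the carl_spackler GitHub repo.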
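
Here is a rough sketch of how the 3 part split could work. It is not the real CARL_SPACKLER::Player code: the helper name split_name, the suffix list, and the returned hash keys are all made up for illustration. It also assumes the last remaining word is the surname, which will not hold for every name.

```ruby
# Hypothetical helper, not the real CARL_SPACKLER::Player code.
def split_name(raw)
  suffixes = ["Jr.", "Sr.", "II", "III", "IV"]   # assumed list, extend as needed

  cleaned = raw.gsub(/\(\w{2}\)/, "").strip      # drop a "(PB)"-style course code
  parts   = cleaned.split(/\s+/)

  suffix = suffixes.include?(parts.last) ? parts.pop : nil
  last   = parts.pop                             # assumption: last word is the surname
  first  = parts.join(" ")                       # everything before it is the first name

  { :first => first, :last => last, :suffix => suffix }
end

p split_name("Jose Maria Olazabal")
p split_name("David Berganio Jr.")
p split_name("Davis Love III (PB)")
```

Keeping the suffix separate means "Davis Love III" and "Davis Love" can still be matched on first and last name.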

CARL_SPACKLER::Player class:


Monday, December 29, 2008

scraping data from the web is difficult (but fun!) around the edges

Been working on a variety of data collection tools, and plan on continuing to do so. It's fun, I'm getting some things done, and it helps exercise the brain. Through all of these exercises:

Starting to realize, the hardest part about collecting data from the web... is not grabbing the data (Nokogiri makes that pretty easy)... it's making the data fit into the 'holes' you want it to fit in. Not referring to int, varchar, etc... not 'an array of hashes' either... that's super easy... the ASCII/non-ASCII character sets are a rather large pain. You have the web/HTML, Ruby (or Python or Java or Perl or PHP or whatever), and MySQL. Each has different constraints on text and represents some characters differently. When collecting foreign and domestic names, as one example, it's especially apparent -- the tildes, the umlauts, the accents, etc. It's really a set of devilish details depending on the problem you're trying to solve. Yet another reason to master Regular Expressions, not to mention to grok the text representations of each system.

...am starting to believe I need to build an Adapter to bridge the gap between all of these character interfaces. In my Ruby classes, make sure that inputs all funnel into a form the database expects and can represent properly. Web/Ruby/Database should understand each other the way I (a human) can when looking at names and interpreting them. i.e. scraping Joakim BÄCKSTRÖM should equal Joakim BÄCKSTRÖM whether it's Joakim BackstrÖm or Joakim Backstrom or Joakim BACKSTROM or whatever -- and no matter which version I scrape on any site, they should all be equal, and should all lead back to PLAYERID = 623, for example, so all his data is collated and connected. And it also appears on the output side as one 'blessed' name.
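
One way to sketch the idea: reduce every scraped spelling to a single comparison key, and look the player up by that key. The helper name_key and the players table below are hypothetical, and String#unicode_normalize assumes Ruby 2.2 or newer (it did not exist in the Ruby of this post's era). The id 623 is just the example number from above.

```ruby
# Hypothetical helper: reduce any spelling of a name to one comparison key.
def name_key(name)
  name.unicode_normalize(:nfd)  # "Ä" becomes "A" plus a combining diaeresis
      .gsub(/\p{Mn}/, "")       # drop the combining marks
      .downcase
      .split                    # split on any run of whitespace
      .join(" ")
end

# Hypothetical lookup table: key => [player id, 'blessed' display name].
players = {
  name_key("Joakim Bäckström") => [623, "Joakim BÄCKSTRÖM"]
}

["Joakim BÄCKSTRÖM", "Joakim BackstrÖm", "Joakim Backstrom", "Joakim BACKSTROM"].each do |scraped|
  id, blessed = players[name_key(scraped)]
  puts "#{scraped} -> #{id} (#{blessed})"
end
```

This only covers accents that decompose into a base letter plus a mark. Letters such as ø have no such decomposition and would need an explicit mapping table on top.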

Any thoughts on this issue? Any ideas of a different design pattern I could use besides Adapter?

Tuesday, December 23, 2008

Merb and Rails unite in Rails3!

I see this as great news for the Ruby and Rails and Merb communities... what are your thoughts?

Yehuda Katz post is a great rundown:

http://weblog.rubyonrails.org/2008/12/23/merb-gets-merged-into-rails-3/comments/24239#comment-24239

http://yehudakatz.com/2008/12/23/rails-and-merb-merge/

http://rubyonrails.org/merb

Friday, November 21, 2008

carl_spackler : Ruby to collect golf scores from web

Creating a new open source library hosted on GitHub, named 'carl_spackler'... it collects data on golf scores from around the web, and produces a normalized form for each collection. Likely the output will be an array of OpenStructs for starters.
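
For anyone who hasn't used OpenStruct, here is roughly what an array of them could look like. The rows and the field names (player, tournament, rounds) are placeholders I made up; nothing here is carl_spackler's actual output format.

```ruby
require "ostruct"

# Made-up rows; the field names are placeholders, not carl_spackler's real output.
scores = [
  OpenStruct.new(:player => "Player A", :tournament => "Example Open", :rounds => [70, 68, 71, 69]),
  OpenStruct.new(:player => "Player B", :tournament => "Example Open", :rounds => [72, 70, 69, 71])
]

# Each field reads like a normal attribute, so downstream code stays simple.
scores.each do |score|
  total = score.rounds.inject(0) { |sum, strokes| sum + strokes }
  puts "#{score.player}: #{total}"
end
```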

Similar principle to OGWR, only for weekly tournament scores, not weekly golf rankings. This way anyone can collect both sets of data and use it as they so choose.

So let's say you want to grab all European Tour scores for the week: just dial up carl_spackler, and let it do your wet work for you.

Lots more to come on 'carl_spackler', 'OGWR', and others.


Carl Spackler, Greenskeeper, Bushwood CC

Tuesday, November 04, 2008

software mission statement

Was reading Signal vs Noise today, per the usual routine at lunch. ...they requested that folks describe them, to the layperson:
http://www.37signals.com/svn/posts/1371-describe-37signals-in-20-seconds-or-less

Here was my suggestion:
“We build sensible and usable customized web applications for humans… while contributing to open source software that helps people build, deploy and maintain web applications for themselves or their own business. In summary, we are entrepreneurial software humanitarians!”

That's my own philosophy on software, and the fellas at 37Signals seem to be of that mindset, judging by their products, their posts and book (Getting Real), and talks by DHH et al. Got me thinking that's a nice mission statement for individuals as well. So there it is.

It's nice to have the folks at 37Signals out there as an inspiration.

Sunday, November 02, 2008

collect Golf's World Rankings on your own with new Ruby module!

...Official Golf World Ranking, OGWR...
wrote some Ruby code in a Module that grabs the official golf world rankings...
use the data as you wish in your own application...
released as open source...
enables anyone to take the active OGWR data and do with it what they wish
(e.g. store it to a database each week, relate it to the players on your website, etc.).

...it's decent for a start, planning on continuing to make it more versatile and useful...
You can find the code on GitHub at the links above.

Friday, October 31, 2008

installing Nokogiri

Looking into using tenderlove's new library, Nokogiri. It looks faster than Hpricot according to the benchmarks he provides, and the syntax also looks really nice.

hmmm.... I installed racc and frex... but I'm getting an error when the gem builds its native extension. I really want to try this out. If I figure it out, or talk to someone who knows, I'll post a reply. For now, here is the error I'm getting.


-bash3.2.17:holtonma:Fri Oct 31 12:09:37>> sudo gem install nokogiri
Building native extensions. This could take a while...
ERROR: Error installing nokogiri:
ERROR: Failed to build gem native extension.

rake RUBYARCHDIR=/opt/local/lib/ruby/gems/1.8/gems/nokogiri-1.0.1/lib RUBYLIBDIR=/opt/local/lib/ruby/gems/1.8/gems/nokogiri-1.0.1/lib
(in /opt/local/lib/ruby/gems/1.8/gems/nokogiri-1.0.1)
checking for xmlParseDoc() in -lxml2... yes
checking for xsltParseStylesheetDoc() in -lxslt... yes
checking for #include
... yes
checking for #include
... yes
checking for racc... yes
checking for frex... yes
creating Makefile
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c html_document.c
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c html_sax_parser.c
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c native.c
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_cdata.c
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_document.c
xml_document.c: In function ‘substitute_entities_set’:
xml_document.c:117: warning: unused parameter ‘klass’
xml_document.c: In function ‘load_external_subsets_set’:
xml_document.c:129: warning: unused parameter ‘klass’
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_dtd.c
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_node.c
xml_node.c: In function ‘new’:
xml_node.c:475: warning: unused parameter ‘klass’
xml_node.c: In function ‘new_from_str’:
xml_node.c:494: warning: unused parameter ‘klass’
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_node_set.c
xml_node_set.c: In function ‘allocate’:
xml_node_set.c:106: warning: unused parameter ‘klass’
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_reader.c
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_sax_parser.c
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_syntax_error.c
xml_syntax_error.c: In function ‘Nokogiri_error_handler’:
xml_syntax_error.c:162: warning: unused parameter ‘ctx’
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_text.c
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_xpath.c
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xml_xpath_context.c
gcc -I. -I. -I/opt/local/lib/ruby/1.8/i686-darwin8.10.1 -I. -I/opt/local/include/libxml2 -I/opt/local/include -fno-common -O2 -fno-common -pipe -fno-common -g -DXP_UNIX -O3 -Wall -Wextra -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -c xslt_stylesheet.c
cc -dynamic -bundle -undefined suppress -flat_namespace -L/opt/local/lib -L"/opt/local/lib" -o native.bundle html_document.o html_sax_parser.o native.o xml_cdata.o xml_document.o xml_dtd.o xml_node.o xml_node_set.o xml_reader.o xml_sax_parser.o xml_syntax_error.o xml_text.o xml_xpath.o xml_xpath_context.o xslt_stylesheet.o -lruby -lxslt -lxml2 -lpthread -ldl -lobjc
ld: warning, duplicate dylib /opt/local/lib/libxml2.2.dylib
/opt/local/lib/ruby/gems/1.8/gems/hpricot-0.6.161/lib/hpricot/builder.rb:26: warning: `&' interpreted as argument prefix
Loaded suite -e
Started
..............................................................................................................................................................................................................................................
Finished in 2.551585 seconds.

238 tests, 701 assertions, 0 failures, 0 errors
/opt/local/lib/ruby/1.8/test/unit.rb:278: [BUG] Bus Error
ruby 1.8.6 (2007-03-13) [i686-darwin8.10.1]

rake aborted!
Command failed with status (): [/opt/local/bin/ruby -w -Ilib:ext:bin:test ...]

(See full trace by running task with --trace)


Gem files will remain installed in /opt/local/lib/ruby/gems/1.8/gems/nokogiri-1.0.1 for inspection.
Results logged to /opt/local/lib/ruby/gems/1.8/gems/nokogiri-1.0.1/gem_make.out

Wednesday, October 29, 2008

Open Source OGWR Scraper

Going to write some Ruby code to scrape the Official Golf World Ranking... going to release it as open source.

Tuesday, October 28, 2008

updated Rails to 2.1.2 along with various rubygems updates

Info on Rails 2.1.2: http://weblog.rubyonrails.org/2008/10/23/rails-2-1-2-security-other-fixes

-bash3.2.17:holts:Tue Oct 28 09:23:10 ~ >> sudo gem update
Password: ********************

Updating installed gems
Updating ZenTest
Successfully installed ZenTest-3.11.0
Updating actionmailer
Successfully installed activesupport-2.1.2
Successfully installed actionpack-2.1.2
Successfully installed actionmailer-2.1.2
Updating activerecord
Successfully installed activerecord-2.1.2
Updating activeresource
Successfully installed activeresource-2.1.2
Updating hoe
Successfully installed rubyforge-1.0.1
Successfully installed hoe-1.8.2
Updating rails
Successfully installed rails-2.1.2
Updating rspec
Successfully installed rspec-1.1.11
Updating rubygems-update
Successfully installed rubygems-update-1.3.1

Ezra Zygmuntowicz Tech Talk : Ruby, Merb, Rubinius

This talk is great. Tons of information if you're interested in Ruby, Merb, Rubinius, etc.

"Engine Yard co-founder Ezra Zygmuntowicz gave a Tech Talk on Monday at Google. He covered some of the open-source projects we’re working on at Engine Yard, including Merb and Rubinius."

http://blog.engineyard.com/2008/10/24/ezra-gives-google-tech-talk-on-merb-and-rubinius

Monday, August 11, 2008

Dynamically typed languages are meritocracies

From Russ Olsen's "Design Patterns in Ruby":

"Statically typed languages are constantly asking about your parent or grandparent, or perhaps, in the case of Java-style interfaces, your aunts and uncles. In a statically typed language, an object's family tree matters deeply. Dynamically typed languages, by contrast, are meritocracies: They are concerned with which methods you have, rather than where those methods came from. Dynamically typed languages rarely ask about an object's ancestry; instead they simply say, 'I don't care who you are related to, Mac. All I want to know is what you can do.'"


I love a good meritocracy.
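
A tiny sketch of the idea, with made-up classes (Caddie, GolfCart) and a made-up method (carry_bag). The two classes share no parent beyond Object and no interface; the haul method only asks what the object can do.

```ruby
class Caddie
  def carry_bag
    "Caddie carries the bag"
  end
end

class GolfCart
  def carry_bag
    "Cart hauls the bag"
  end
end

# No shared parent class or interface is required, only the method itself.
def haul(helper)
  if helper.respond_to?(:carry_bag)
    helper.carry_bag
  else
    "#{helper.class} cannot carry a bag"
  end
end

puts haul(Caddie.new)
puts haul(GolfCart.new)
puts haul(Object.new)
```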

Wednesday, July 30, 2008

DHH at "Startup School"

Thursday, July 24, 2008

Metaclasses and Ruby Inheritance Chain

This is a very large and involved topic. (Great topic! It is part of what makes Ruby so powerful.)
You can add new methods to a class on the fly at any time. You can also overwrite methods at any time. And, in Ruby, this truly means virtually any class, including core classes like Array, String, etc.

Let's say we have

> class Course
>   attr_accessor :id, :name
>   def initialize(id, name)
>     @id, @name = id, name
>   end
> end

> c = Course.new(1, "Augusta National")
=> #<Course:0x81cfb94 @id=1, @name="Augusta National"...

> c.class
=> Course

> c.id
=> 1
> c.name
=> "Augusta National"


So in this case, a Course object, once initialized, will have an @id and a @name variable. It can hold any other variables at any time too...

> c.instance_variable_set( "@par", 72 )
=> 72
> c.par
=> NoMethodError: undefined method 'par' for #<Course..

It's holding that variable for par, but we cannot get at it without a reader method. So let's say you wanted the reader/writer methods for par only on that instance, not on the class (just an example).


Classes hold methods; objects do not. The exception is an object's metaclass (also called its singleton class), which holds methods for that one object only:
> class << c
>   attr_accessor :par
>   def pretty_par
>     puts "par: #{@par}"
>   end
> end

Now you can do
> c.par
=> 72

Also, when you do:
> c.methods
=> ["par", "id", "name", "pretty_par", ...]   # par and pretty_par are referred to as the
                                               # object's "singleton methods"

but when you show the instance methods of the Course class:
> Course.instance_methods
=> ["id", "name", ...]   # i.e. no par or pretty_par, because these methods are not
                         # defined in the Course class, only in c's metaclass
Likewise:

> c2 = Course.new(2, "Shinnecock Hills")
=> #<Course:0x91cfb94 @id=2, @name="Shinnecock Hills">

but c2 won't have access to any of c's singleton methods (i.e. neither "c2.par" nor "c2.pretty_par" will work; those methods exist only in c's metaclass).
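
If you want to paste the whole thing into a file and run it, here is the session above as one script. One change of mine: pretty_par returns the string instead of printing it, so it is easier to check. Also note that current Rubies list method names as symbols rather than strings.

```ruby
class Course
  attr_accessor :id, :name

  def initialize(id, name)
    @id, @name = id, name
  end
end

c  = Course.new(1, "Augusta National")
c2 = Course.new(2, "Shinnecock Hills")

c.instance_variable_set(:@par, 72)

# Open c's metaclass (singleton class) and add methods to this one object only.
class << c
  attr_accessor :par

  def pretty_par
    "par: #{@par}"
  end
end

puts c.pretty_par                       # c can use the new methods
p c.singleton_methods.sort              # methods that live only in c's metaclass
p c2.respond_to?(:par)                  # c2 never got them
p Course.instance_methods(false).sort   # the class itself is unchanged
```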


You see, when Ruby looks for a method, it first looks at the object's metaclass,
i.e. the metaclass intercepts the request and gets checked first.

Then it goes to the object's class, then that class's superclass, then the superclass above that, and so on
...all the way back to the Object base class (modules mixed in along the way are searched as well).
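
A small sketch of that lookup order, using made-up classes (Venue, GolfCourse) so each method resolves at a different level. None of these names come from the earlier example.

```ruby
class Venue
  def describe; "Venue#describe"; end
  def region;   "Venue#region";   end
  def kind;     "Venue#kind";     end
end

class GolfCourse < Venue
  def describe; "GolfCourse#describe"; end
  def region;   "GolfCourse#region";   end
end

course = GolfCourse.new

# A method defined on the object itself lands in its metaclass.
def course.describe
  "singleton describe"
end

puts course.describe   # found in the metaclass first
puts course.region     # not in the metaclass, so found in GolfCourse
puts course.kind       # not in GolfCourse either, so found in Venue

# The search path Ruby walks, nearest first.
p course.singleton_class.ancestors
```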


You can insert methods into the class, or into an object's metaclass at any level in the inheritance chain.
Very powerful.



One more beauty... if you attach a:
...
def method_missing(name, *args, &block)
  puts "there was no method that matched that"
end
...
you can intercept the call to a missing method (anywhere in the inheritance chain: the class, the metaclass, etc.)
and do things like creating methods on the fly, programmatically.
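
Here is a minimal sketch of the "creating methods on the fly" part. The Scorecard class and its scores hash are invented for the example; the pattern is just method_missing plus define_method, with respond_to_missing? (Ruby 1.9.2 and later) kept in step.

```ruby
class Scorecard
  def initialize(scores)
    @scores = scores   # e.g. { :round1 => 70, :round2 => 68 }
  end

  # Ruby calls this only after normal lookup has failed all the way up the chain.
  def method_missing(name, *args, &block)
    if @scores.key?(name)
      # Define a real method so the next call never reaches method_missing.
      self.class.send(:define_method, name) { @scores[name] }
      send(name)
    else
      super   # keep the usual NoMethodError for everything else
    end
  end

  # Keep respond_to? consistent with what method_missing will answer.
  def respond_to_missing?(name, include_private = false)
    @scores.key?(name) || super
  end
end

card = Scorecard.new(:round1 => 70, :round2 => 68)
puts card.round1
p Scorecard.method_defined?(:round1)
```

Calling super in the else branch matters: without it, every typo would be swallowed silently instead of raising NoMethodError.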

Reopening classes and changing them at runtime like this is called "monkey patching" (some people call it "duck punching", and there are other names).
Ruby is a great language to metaprogram with. Rails uses method_missing often (along with const_missing) --
for instance, ActiveRecord uses method_missing to dynamically support the find_by_ methods.
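
To give a feel for how a find_by_ style call can be answered by method_missing, here is a toy version over an in-memory array. This is not ActiveRecord's implementation; PlayerList and its data are made up, and a real finder would build a database query instead.

```ruby
class PlayerList
  def initialize(players)
    @players = players   # array of hashes with symbol keys
  end

  # Turn find_by_<attribute>(value) into a search on that attribute.
  def method_missing(name, *args, &block)
    match = name.to_s.match(/\Afind_by_(\w+)\z/)
    return super unless match

    attribute = match[1].to_sym
    @players.find { |player| player[attribute] == args.first }
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("find_by_") || super
  end
end

list = PlayerList.new([
  { :name => "Player A", :country => "SWE" },
  { :name => "Player B", :country => "USA" }
])

p list.find_by_country("USA")
p list.find_by_name("Player A")
```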

That's a whole other interesting topic in itself to explore!

Friday, June 27, 2008

Powerset : Ruby Front End

Great insights on Powerset's Ruby front end from Kevin Clark's blog:

http://glu.ttono.us/articles/2007/06/21/powerset-to-launch-front-end-on-ruby

"The simple fact is that Ruby wasn’t the source of Twitter’s woes. As it often happens with rapidly growing sites, they ran into architectural problems. Some design decisions don’t hurt until they reach a massive scale and at that point you have to rethink your approach. In an email he writes:

For us, it’s really about scaling horizontally - to that end, Rails and Ruby haven’t been stumbling blocks, compared to any other language or framework. The performance boosts associated with a “faster” language would give us a 10-20% improvement, but thanks to architectural changes that Ruby and Rails happily accommodated, Twitter is 10000% faster than it was in January

This is great news for Twitter, but even better for us because we don’t have the bottle necks that they’ve struggled with – databases, instant messaging servers, and regularly recycling cache systems – which makes scaling horizontally much much smoother. At that point, our scaling issue doesn’t concern Ruby. For a search engine, the front-end is largely just a templating system and the real work happens in the back when we process your query."

Thursday, June 26, 2008

Microsoft buying Powerset?

...Rumor has it, that Powerset is being bought by MSFT... that can only mean they are trying to go semantic in their search quest... hopefully the great technical and semantic momentum with Powerset will carry on.

...Powerset is built entirely on open source -- Ruby, Merb, Rails, god, Mongrel, Mootools, Memcache, Erlang, Fuzed*, YAWS, Hadoop... one of their developers is Kevin Clark, an active Ruby/Rails developer: http://glu.ttono.us/articles/2008/05/12/holy-god-powerset-launches

More on Powerset:
http://searchengineland.com/080512-000100.php

had written about Powerset and Kevin Clark 9 days ago...
http://holtsblog.blogspot.com/2008/06/businessweek-article-on-search.html

Interesting to see how this plays out.

Tuesday, June 17, 2008

BusinessWeek article on Search

Interesting points about the gap between what we have now, and where much of the potential in Search remains, especially when it comes to the Semantic web:

http://www.businessweek.com/technology/content/jun2008/tc20080616_034849.htm

...Powerset is built entirely on open source -- Ruby, Merb, Rails, god, Mongrel, Mootools, Memcache, Erlang, Fuzed*, YAWS, Hadoop... one of their developers is Kevin Clark, an active Ruby/Rails developer: http://glu.ttono.us/articles/2008/05/12/holy-god-powerset-launches

More on Powerset:
http://searchengineland.com/080512-000100.php

Great stuff. Thanks to Kevin for sharing that info on his blog -- inspiring work.