Making a Little Twitter App

For a long time I have used a WordPress plugin called Twitter Tools to create a weekly post of my tweets. It was pretty neat. I was tweeting websites, so I could use my blog to bookmark sites. (See more pages about it here and here.)

I also use it on a few other sites I help run, and recently Twitter Tools “upgraded” from 2.4 to 3.0 and seems to have stopped working: there are no more weekly digest posts. I am not the only person having problems with it. There is a plugin called Twitter Digest that I might look at.

Right now I am writing something in Ruby that will get tweets and format them as HTML, so I can copy and paste my tweets into my blog. (I don’t think there are APIs for WordPress, but I could be wrong.) I will put it on Github. I guess it’s redundant, but I am doing it as a bit of an exercise.
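The sketch below is roughly what I have in mind, using the twitter gem; the screen name, environment variable names, and HTML layout are placeholders, and the client setup may differ a bit depending on the gem version:

require 'twitter'

# Build a REST client from credentials kept in environment variables
# (the variable names here are just placeholders).
client = Twitter::REST::Client.new do |config|
  config.consumer_key        = ENV['TWITTER_CONSUMER_KEY']
  config.consumer_secret     = ENV['TWITTER_CONSUMER_SECRET']
  config.access_token        = ENV['TWITTER_ACCESS_TOKEN']
  config.access_token_secret = ENV['TWITTER_ACCESS_TOKEN_SECRET']
end

# Grab recent tweets and turn them into an HTML list I can paste into a post.
items = client.user_timeline('my_screen_name', count: 50).map do |tweet|
  "<li>#{tweet.created_at.strftime('%Y-%m-%d')}: #{tweet.text}</li>"
end

puts "<ul>\n#{items.join("\n")}\n</ul>"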

Working on a Mongo Project

These days I am getting used to my new job and driving around all the time in Austin.

I am also working on something on my own time. I have a project on Github called dividend_champions. I am parsing some spreadsheets in Ruby and uploading the results into Mongo using Mongoid. The spreadsheets are from the Dividend Champions series from the DRIP Investing Resource Center. They have data about companies that have raised their dividends for several years.

I am using Mongo because fields have been added to the spreadsheets between 2008 and 2012. I figured that instead of constantly making changes to a schema, it would be best to use something schema-less. When I am done I will work on some queries, both in the mongo shell and with Mongoid.
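Here is a rough sketch of the kind of Mongoid document I am storing; the class and field names are illustrative, not the exact dividend_champions schema:

require 'mongoid'
# (assumes Mongoid has already been configured to point at a database)

class Company
  include Mongoid::Document

  field :ticker,          type: String
  field :name,            type: String
  field :years_of_raises, type: Integer
  field :yield_pct,       type: Float
  # A field that shows up in a later year's spreadsheet can just be
  # added here, with no migration to run.
end

Company.create(ticker: 'KO', name: 'Coca-Cola', years_of_raises: 50, yield_pct: 2.8)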

NMatrix Compilation Instructions

Here are some compilation and installation instructions for Ubuntu that I wrote for the NMatrix gem which is part of the SciRuby project.

Make a working directory for this project: /home/$USER/ruby/sciruby

ATLAS needs CPU throttling to be disabled. On Ubuntu, install the package indicator-cpufreq. It will be an applet on the menu bar. Select “performance”.

Download the LAPACK tar file from http://www.netlib.org/lapack. Save it and unzip it, but do not untar it. You do not actually need to compile LAPACK to compile ATLAS, but you do need the tar file.

To get ATLAS, go to http://math-atlas.sourceforge.net/ and bunzip2 the file in the working directory.

For some reason, you cannot run “configure” in the untarred directory. Create a new directory /home/$USER/ruby/sciruby/outputatlas and run the ATLAS configure command from there.

Then run “make”. It takes about 20 minutes.

This will create a few *.a files. As root, make a directory /usr/local/atlas, copy the *.a files there, and create links to them as .so files.

Go to https://github.com/SciRuby/nmatrix, get the github repo, and follow the directions.

Running “gem install pkg/nmatrix-0.0.2.gem” does take a while, and there is not a lot of feedback to the screen.

After that completes, run “rspec” or “rake spec”.

Notes:
When I ran rspec, I got some errors, but I posted them to the list, and they are all expected and/or being looked at.

The SciPy page has a section for Ubuntu. It says that I should go into the directory I created to build ATLAS, then go into lib, and run “make shared”. This would be /home/$USER/ruby/sciruby/outputatlas/lib. This is supposed to create the .so files, but I tried it and was not able to get it to work. I am googling to find the answer, but so far no success. By creating links to the .a files, though, I got it to work. I will look into it and hopefully get some feedback from the SciRuby list, but so far this seems to work.


More notes/commentary (will probably not appear on official SciRuby site):

NMatrix depends on some math libraries written in C and Fortran. The Linux installation instructions on the SciRuby site tell people to go to the SciPy site. I think this is bad for a couple of reasons. First off, the SciRuby project is a competitor to SciPy. It is not as advanced or ready for prime time, but if it really wants to compete it will need to handle things on its own. Plus, I think some of the instructions on the SciPy site are out of date.

Ruby is a high-level language, so it is a bit odd to be dealing with C and Fortran. I like to deal with high-level languages to avoid having to deal with low-level details. But if you want to do some serious math, C and Fortran are part of the deal. SciPy is really just a wrapper around the C and Fortran libraries.

 

Update on Chicago Ruby Testing Group

The Chicago Ruby Testing Group is currently on hold. We are revising the curriculum. We will also probably only do it once a week instead of twice a week.

Right now I am working on a short presentation on Cucumber. A big chunk of it will cover why we will not be spending a lot of time on Cucumber. A big reason is the network effect: it seems like a beast, yet it does not have a lot of traction. But people might encounter it, so I think we should cover it.

More Well Grounded Rubyist

Recently I got through Chapters 10, 11 and 12 in The Well-Grounded Rubyist.

Chapter 11 was about regular expressions. I found out about the MatchData class, which I really like.
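A quick MatchData example (my own, not from the book):

md = /(\w+)@(\w+)\.com/.match('write to leto@dune.com today')

md.class      # => MatchData
md[0]         # => "leto@dune.com"  (the whole match)
md[1]         # => "leto"           (first capture group)
md[2]         # => "dune"
md.pre_match  # => "write to "
md.post_match # => " today"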

Chapter 10 was about Enumerable and Enumerator. It is a chapter I will probably look at again many times in the future. There was a tweet from Evan Light in which he stated that after 6 years of Ruby, he now realizes that a lot of his problems can be solved by a careful reading of the doc page for Enumerable.

So this is why ActiveRecord has find_by_SOMEFIELD (which returns the first match) and find_all_by_SOMEFIELD (which returns all matches).
In Enumerable, find_all is also called select.
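For example, with plain Enumerable (nothing ActiveRecord-specific):

numbers = [1, 2, 3, 4, 5, 6]

numbers.find     { |n| n > 3 }  # => 4          (first match, like find_by_...)
numbers.find_all { |n| n > 3 }  # => [4, 5, 6]  (all matches, like find_all_by_...)
numbers.select   { |n| n > 3 }  # => [4, 5, 6]  (select is another name for find_all)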

Enumerable.grep can filter/grep by content (as grep on command line does) or it can filter/grep by type. Pretty nice.
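For example:

# Filter by content, like command-line grep:
%w[apple banana cherry].grep(/an/)    # => ["banana"]

# Or filter by type, thanks to case equality (===):
[1, 'two', 3.0, :four].grep(Numeric)  # => [1, 3.0]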

Look up “case equality” ===

group_by is pretty funky
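For example:

(1..10).group_by { |n| n % 3 }
# => {1=>[1, 4, 7, 10], 2=>[2, 5, 8], 0=>[3, 6, 9]}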

Another thing I figured out is how to load one of my own Ruby files into pry or IRB.
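A minimal sketch of what I mean (the file name is made up):

# Inside pry or IRB, a relative require pulls in your own file:
require './my_enumerable_playground'

# Or put the current directory on the load path first:
$LOAD_PATH.unshift(Dir.pwd)
require 'my_enumerable_playground'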

Enumerable.inject is also known as reduce (as in MapReduce) or “fold” in some functional languages
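A minimal inject example (my own, not the one from the book):

# Reduce an array to one value with a repeated, cumulative operation.
# The accumulator (sum) comes first in the block; n is the next element.
[3, 5, 10].inject { |sum, n| sum + n }       # => 18
[3, 5, 10].inject(100) { |sum, n| sum + n }  # => 118 (100 is an explicit starting value)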

In the block, n is the next element in the array. The first block parameter is the accumulator, which starts out as the first array element if you do not pass inject an initial value. It is called “reduce” because you are reducing the array to one value based on a repeated, cumulative operation. I think that “consolidate” might be a better name. It might be a bit verbose, but it would be clearer.

Working With Nokogiri

I recently worked a bit with Nokogiri to parse some XML. I decided to parse the XML behind the map for the Craftsmanship Manifesto. The map is here and the XML behind the map can be found here. I put this on Github, and you can find it here.

I tried parsing the XML the textbook way. bin/run.first.parser.sh calls lib/first_parser.rb. It’s a mess. It seems like you have to call both the Element and the Text classes just to get one element; at least that is what I remember and what I can gather from the code. I have a lot of comments in there. I always have comments in code that is just for exploration. Having to call two classes to get one element just seems wrong.

I then looked into using XPath. bin/run.show.parser.sh calls lib/show_parser.rb, which is the example on the Nokogiri site. I was able to parse it with bin/run.first.path.parser.sh, which calls lib/first_path_parser.rb, and bin/run.path.parser.sh, which calls lib/path_parser.rb. I had a problem with namespaces. I first tried doc.remove_namespaces!, but I did not like the idea of disabling namespaces. There was no namespace prefix in the document, so I just prepended “xmlns:” to all the element names and I got it to work.
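Here is a hedged sketch of the XPath approach, assuming a KML-like file with Placemark and name elements; the real code is in the lib/ scripts mentioned above:

require 'nokogiri'

doc = Nokogiri::XML(File.read('map.kml'))

# The element names are unprefixed in the file, but prepending "xmlns:"
# to them in the XPath makes the queries match:
doc.xpath('//xmlns:Placemark').each do |placemark|
  puts placemark.at_xpath('xmlns:name').text
end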

Eventually I decided to try JSON. I found out about a gem called crack, which can convert an XML document into a Ruby hash (effectively JSON). That is in bin/run.crack.is.whack.sh, which calls lib/crack_is_whack.rb. I was able to try it out with bin/run.first.json.attempt.sh, which calls lib/first_json_attempt.rb. It parses a small version of the file. The whole map is parsed and output to CSV with bin/run.json.parser.sh, which calls lib/json_parser.rb.
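The crack approach boils down to something like this (the file name is just illustrative):

require 'crack'
require 'json'

# XML in, a plain Ruby hash out, which can then be walked or dumped as JSON.
hash = Crack::XML.parse(File.read('map.kml'))
puts JSON.pretty_generate(hash)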

JSON is a lot easier than XML.

Notes On Creating Test Data With FactoryGirl

A while ago Steve Klabnik posted about creating objects for tests via FactoryGirl, and there was another post in response.

I am working with FactoryGirl for Heroku Pinger. I was trying to restrict users to only create five websites.

To test this, I copied the test “creates a few websites with FactoryGirl” into a new one called “only creates five websites with FactoryGirl”. It failed. It seemed to allow the user to only create 5 sites, but at the end of the test the site count (querying on user_id) was 11. I could not figure this out.

Then it hit me: FactoryGirl was creating an array of sites using the generate_factory_sites method. I would then take that array and try to create the sites again. The user’s site count is only checked in the website controller, while FactoryGirl uses the model directly. So by the time I am checking to see how many sites a user has, I am already over the limit, and the limit is set to 0. So I changed the test to put data for websites in an array of hashes first, and use the hashes to create the sites manually.
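Here is a hedged sketch of the reworked test; the factory name, user setup, and matcher style are illustrative, not the exact Heroku Pinger code:

it 'only creates five websites with FactoryGirl' do
  # Build plain attribute hashes instead of saved records...
  website_attrs = 6.times.map { FactoryGirl.attributes_for(:website) }

  # ...and create the sites through the controller, so the five-site
  # limit actually gets enforced.
  website_attrs.each do |attrs|
    post :create, website: attrs
  end

  Website.where(user_id: user.id).count.should == 5
end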

I might go back later and clean things up. (I was about to write “I will”, but let’s be honest: how often does that REALLY happen?)

I also put in something to ensure that a user can only delete their own sites. That was a hassle.

So what I tried to do was add some sites with one user ID, then try to delete them with another, and then delete them with the first user ID. It did not work: the session variable user_id would not change. I read online that you can set it like this:
session[:user_id] = JDJDDJ
but that did not work.

So instead I created two users and added some sites as the first one. Then I changed the user_id field on those website objects to the second user and confirmed that the first user could not delete them. Then I changed the user_id back and deleted the sites.

Thoughts On Math in Ruby After Lone Star

I gave a presentation on math in Ruby for Lone Star Ruby Conf.

I spent a lot of it talking about SciRuby. The SciRuby people were pretty happy that someone was spreading the word.

Here is a summary that I posted to the SciRuby mailing list:

I may not have done a great job of selling SciRuby.

A few people in attendance felt that the approach was wrong. They said that it might be best if Ruby was used as a DSL around some pre-existing libraries, like Numerical Recipes or GSL. And I did not have a whole lot to counter that with.

I did say that SciRuby is a fairly new project taking on a large topic, and I did encourage people to look into it.

No replies to that so far. Perhaps I ticked them off. I will look at the GNU Scientific Library.

At some point I will post my presentation.

Notes From The Little Redis Book

I am working on a small Ruby app that uses Redis. It won’t really do much. I am just doing it to work with Redis a bit. I am using the Redis gem.
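The app itself boils down to calls like these with the redis gem (the key names are just examples):

require 'redis'

redis = Redis.new  # defaults to localhost:6379

redis.set('users:leto', '{name: leto, planet: dune}')
puts redis.get('users:leto')

redis.incr('stats:page:about')
puts redis.get('stats:page:about')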

For my reference, here are some commands from The Little Redis Book. (I think this is allowed under the license.)

https://github.com/karlseguin/the-little-redis-book/blob/master/en/redis.md
set users:leto "{name: leto, planet: dune, likes: [spice]}"
get users:leto
set users:leto "{name: leto, planet: dune, likes: [spice]}"
> strlen users:leto
(integer) 42

> getrange users:leto 27 40
"likes: [spice]"

> append users:leto " OVER 9000!!"
(integer) 54
> incr stats:page:about
(integer) 1
> incr stats:page:about
(integer) 2

> incrby ratings:video:12333 5
(integer) 5
> incrby ratings:video:12333 3
(integer) 8
hset users:goku powerlevel 9000
hget users:goku powerlevel

hmset users:goku race saiyan age 737
hmget users:goku race powerlevel
hgetall users:goku
hkeys users:goku
hdel users:goku age

lpush newusers goku
ltrim newusers 0 50

keys = redis.lrange('newusers', 0, 10)
redis.mget(*keys.map {|u| "users:#{u}"})

sadd friends:leto ghanima paul chani jessica
sadd friends:duncan paul jessica alia

sismember friends:leto jessica
sismember friends:leto vladimir

sinter friends:leto friends:duncan

sinterstore friends:leto_duncan friends:leto friends:duncan

zadd friends:duncan 70 ghanima 95 paul 95 chani 75 jessica 1 vladimir

zcount friends:duncan 90 100

zrevrank friends:duncan chani

The book has a good summary of Big O notation.

Bad command:
set users:leto@dune.gov "{id: 9001, email: 'leto@dune.gov', …}"
set users:9001 "{id: 9001, email: 'leto@dune.gov', …}"

Better:
set users:9001 "{id: 9001, email: leto@dune.gov, …}"
hset users:lookup:email leto@dune.gov 9001

get users:9001

id = redis.hget('users:lookup:email', 'leto@dune.gov')
user = redis.get("users:#{id}")

sadd friends:leto ghanima paul chani jessica

sadd friends_of:chani leto paul

keys = redis.lrange('newusers', 0, 10)
redis.mget(*keys.map {|u| "users:#{u}"})

sadd friends:vladimir piter
sadd friends:paul jessica leto "leto II" chani

Ruby:
redis.pipelined do
  9001.times do
    redis.incr('powerlevel')
  end
end

A transaction:
multi
hincrby groups:1percent balance -9000000000
hincrby groups:99percent balance 9000000000
exec

In Ruby:
redis.multi()
current = redis.get('powerlevel')
redis.set('powerlevel', current + 1)
redis.exec()

Or better yet:
redis.watch('powerlevel')
current = redis.get('powerlevel')
redis.multi()
redis.set('powerlevel', current + 1)
redis.exec()

Bad idea in production:
keys bug:1233:*

Better:
hset bugs:1233 1 "{id:1, account: 1233, subject: '…'}"
hset bugs:1233 2 "{id:2, account: 1233, subject: '…'}"

expire pages:about 30
expireat pages:about 1356933600

ttl pages:about
persist pages:about

setex pages:about 30 '<h1>about us</h1>….'

Redis has pub/sub. Can it be used in place of something like RabbitMQ or ZeroMQ?
subscribe warnings

publish warnings "it's over 9000!"
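Through the redis gem the same thing looks roughly like this; subscribe blocks, so the subscriber runs in its own thread here:

require 'redis'

subscriber = Thread.new do
  Redis.new.subscribe('warnings') do |on|
    on.message do |channel, message|
      puts "#{channel}: #{message}"
    end
  end
end

sleep 0.1  # give the subscription a moment to register
Redis.new.publish('warnings', "it's over 9000!")
subscriber.join(1)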

config set slowlog-log-slower-than 0

slowlog get
slowlog get 10

rpush users:leto:guesses 5 9 10 2 4 10 19 2
sort users:leto:guesses

sadd friends:ghanima leto paul chani jessica alia duncan
sort friends:ghanima limit 0 3 desc alpha

sadd watch:leto 12339 1382 338 9338

set severity:12339 3
set severity:1382 2
set severity:338 5
set severity:9338 4

sort watch:leto by severity:* desc

A group:
hset bug:12339 severity 3
hset bug:12339 priority 1
hset bug:12339 details "{id: 12339, ….}"

hset bug:1382 severity 2
hset bug:1382 priority 2
hset bug:1382 details "{id: 1382, ….}"

hset bug:338 severity 5
hset bug:338 priority 3
hset bug:338 details "{id: 338, ….}"

hset bug:9338 severity 4
hset bug:9338 priority 2
hset bug:9338 details "{id: 9338, ….}"
end of group

sort watch:leto by bug:*->priority get bug:*->details

sort watch:leto by bug:*->priority get bug:*->details store watch_by_priority:leto

config get *log*

rename-command CONFIG 5ec4db169f9d4dddacbfb0c26ea7e5ef
rename-command FLUSHALL 1041285018a942a4922cbf76623b741e

"type $KEY_NAME" is a good command as well

 

Notes From Lone Star Ruby Conf

I went to Lone Star Ruby Conf in Austin, TX last week. I also spoke about math in Ruby.

I have presented at the Chicago Java Users Group in the past, but this was the first conference I spoke at. I practiced my presentation a few times beforehand, but I still went a bit too fast. Some people were interested in the topic, though.

For now, I will dump some notes I took at the conference.

2012-08-11_12.20.14
Ruby Conf:
extend versus include:
an instance can extend a module
a class includes a module
You may only need to extend in certain places (a quick sketch follows these notes)
DCI: Extending objects at runtime where it makes sense
DCI (Data Context Interaction)
ActionController has 361 methods (Object has 134 out of the box) (8th Light U was about something similar today)
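A minimal sketch of the extend-versus-include distinction (module and class names are made up):

module Greeter
  def greet
    "Hello from #{self.class}"
  end
end

class Person
  include Greeter      # include: every instance of Person gets #greet
end

Person.new.greet       # => "Hello from Person"

robot = Object.new
robot.extend(Greeter)  # extend: only this one object gets #greet
robot.greet            # => "Hello from Object"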

http://jamesgolick.com/2012/5/22/objectify-a-better-way-to-build-rails-applications.html

2012-08-11_13.35.08
Performance in Ruby
Languages are not necessarily slow. Comes down to “big-O”
Majority of performance issues:
poor choice of data structures, algorithms (N + 1), capped speed of light
O(1), O(log n), O(n) (45-degree line), O(n^2) (almost straight up)

twitter client in Ruby called “t”
lots of dependencies
$LOADED_FEATURES can be pretty big
Rails has 784 $LOADED_FEATURES
Watch this as you build your app
This is just startup time, not runtime – this is why free Heroku apps can take a while to start up
BRPOPLPUSH in Redis helped them out
Look at his Rails app: https://github.com/kowsik/vroom running at vroom-ruby.heroku.com
hit rate: number of requests/second, not the same as average response time
concurrency: number of open sockets, aka simultaneous users
single-threaded Sinatra has a concurrency of one
Connection pools: jamming a lot of requests into size of your connection pool

Redis: they put it in front of CouchDB
Redis command docs will give you the big-o factor
sets can be O(n)
Some things in Lists can be O(n)
sets: sinter, sinterstore is O(n*m) – intersection of sets

Did I mention I met Matz?