twitterize
by Jacob Harris
http://www.nycruby.org/
== DESCRIPTION:
Twitterize is a quick and dirty hack I did in a few hours to play with the Twitter API (seriously, there are no tests and I'm sure there is code crude enough in there to make you recant any friendship we might have). It takes any number of RSS feeds and posts them to one or more Twitter accounts. For example, various RSS feeds from the New York Times are sent to the Twitter accounts nytimes, nyt_arts, nyt_biz, etc. This is accomplished via a command-line script that requires a separate configuration file (see below). Since Twitter is a rapidly growing (read: somewhat flaky) service, twitterize also uses a database to store twitters to be posted, so it can recover later if Twitter is down. The database also lets the app retain feed GUIDs and avoid duplicate posts.
Twitterize has two execution stages. In the first stage, it checks if any feeds are ready to be refreshed, downloads new articles, and saves outgoing twitter messages in the database. In the second stage, it posts any new twitters up to their corresponding accounts. Both stages are independent and can be thought of as a producer-consumer model. When twitterize finishes the second stage, it exits. Rather than running it continuously as a daemon, I think it's much better to execute twitterize every 5-30 minutes via a cron job.
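The two stages described above can be sketched roughly as follows. This is a minimal illustration, not twitterize's actual code: the +Item+ struct, the in-memory +store+ array (standing in for the database), and the +poster+ callable are all assumptions made for the example.

```ruby
# Hypothetical in-memory stand-in for a row of the items table.
Item = Struct.new(:guid, :twitter, :posted)

# Stage one (producer): queue a message for every article whose GUID
# has not been seen before. `articles` is an array of hashes with
# :guid and :title keys, standing in for parsed feed entries.
def queue_new_items(store, articles)
  articles.each do |article|
    next if store.any? { |item| item.guid == article[:guid] }
    store << Item.new(article[:guid], article[:title], false)
  end
  store
end

# Stage two (consumer): try to post every unposted message. `poster`
# is any callable that raises when the post fails; a failed item stays
# unposted and is retried on the next run.
def post_pending(store, poster)
  store.reject(&:posted).each do |item|
    begin
      poster.call(item.twitter)
      item.posted = true
    rescue StandardError
      next
    end
  end
  store
end
```

Because unposted items simply stay in the store, a run where Twitter is down loses nothing; the next cron run picks them up again.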
== FEATURES/PROBLEMS:
* Did I mention before this is hackish software? I didn't even write unit tests (I know, lame!) so you might encounter bugs.
* Although twitterize runs with a database store (any ActiveRecord-supported DB will do), the list of feeds is updated on startup from the feeds specified in the config file. Adding a new feed is as easy as writing some YAML instead of wrangling some SQL (or me coding some more app logic).
* However, twitterize does NOT remove feeds from the DB if you remove them from the config file. Renaming feeds is not a good idea either.
* Twitterize doesn't set up the DB or config file for you either. Both of these are possible, but I'm too lazy to do that right now.
* RSS/Atom/RDF feeds should all work just fine (thanks to Feed Tools)
* Feed reading is wholesale and crude. I don't support if-modified-since or ETags yet, so please be gentle.
* I think I've nailed down all the ISO-8859-1/UTF-8 issues, but it's possible you might still get stung.
* Twitterize does not purge old records from the database. That's up to you, but be careful being too aggressive with infrequently updated blogs.
* Twitterize is but a cold logical program. Even if its twitter accounts are your friends, it still does not love you.
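On the point about purging old records: presumably the risk with slow blogs is that an old entry can still be listed in the feed, so deleting its record also deletes the stored GUID and the entry may be posted again. That reading is an assumption, as is everything in the sketch below (the +StoredItem+ struct, the +purgeable+ helper and its parameters); twitterize itself ships no purge logic.

```ruby
# Hypothetical stand-in for a row of the items table.
StoredItem = Struct.new(:guid, :posted, :published_at)

# Return the items that look safe to delete: already posted, older than
# `max_age` seconds, and no longer listed in the feed (`live_guids`).
# Items without a publication date are kept, to stay on the safe side.
def purgeable(items, live_guids, max_age, now = Time.now)
  items.select do |item|
    item.posted &&
      item.published_at &&
      item.published_at < now - max_age &&
      !live_guids.include?(item.guid)
  end
end
```

Keeping any item whose GUID is still in the feed is the conservative choice here: the table grows a little, but nothing gets posted twice.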
== SYNOPSIS:
twitterize --config-file ~/twitter.yml --verbose --lookback 12h
This illustrates some of the salient operational points of twitterize. It takes a config file (see below for format). You can turn on some manner of useless chattering with the --verbose option. The --lookback option is actually quite useful: specifying it tells twitterize to log feed items older than the lookback window, but not post them to Twitter (the h, m, and d modifiers are for hours, minutes, and days). This is good if you add a bunch of new feeds and don't want to send stories from 2 months ago in your blog feed to Twitter (keep it fresh). I suppose this could be a config-file option, but it was useful for me to make it a command-line option when testing out the NY Times feeds.
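To make the lookback window concrete, here is a small sketch of how a value like 12h could be turned into seconds and compared against an item's publication time. The names +parse_lookback+ and +within_lookback?+ are made up for this example, and rejecting values without a modifier is an assumption; the real option parsing may differ.

```ruby
# Seconds per unit for the h, m and d modifiers.
LOOKBACK_UNITS = { 'm' => 60, 'h' => 3600, 'd' => 86_400 }.freeze

# Convert a string such as "12h" into a number of seconds.
def parse_lookback(text)
  match = /\A(\d+)([mhd])\z/.match(text)
  raise ArgumentError, "bad lookback: #{text.inspect}" unless match
  match[1].to_i * LOOKBACK_UNITS.fetch(match[2])
end

# True when an item published at `published_at` falls inside the
# lookback window ending at `now` and should therefore be posted.
# Items outside the window would be logged but not posted.
def within_lookback?(published_at, lookback, now = Time.now)
  published_at >= now - parse_lookback(lookback)
end
```

With a 12h lookback, a story from an hour ago is inside the window, while one from 2 months ago is outside it and would only be recorded.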
== REQUIREMENTS:
* Twitterize itself is dependent on the following gems:
  * feedtools
  * htmlentities
  * activerecord
  * shorturl (= 0.8.2)
    * There actually is a bug with shorturl-0.8.3 and posting to tinyurl, so don't use it (the writer changed the default protocol from POST to GET)
* Of course, you also need a database to store twitterize's data.
* Finally, the tedious and slow process of setting up Twitter accounts is still manual and each one needs a distinct email address (the [email protected] trick seems to work for some mailer daemons)
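The shorturl pin above can be checked with RubyGems' own version classes. This is only a sketch: +SHORTURL_REQUIREMENT+ and +shorturl_version_ok?+ are names invented here, not part of twitterize.

```ruby
require 'rubygems'

# The requirement string mirrors the "= 0.8.2" pin listed above.
SHORTURL_REQUIREMENT = Gem::Requirement.new('= 0.8.2')

# True when the given shorturl version string satisfies the pin.
def shorturl_version_ok?(version)
  SHORTURL_REQUIREMENT.satisfied_by?(Gem::Version.new(version))
end

# In a script, the same pin can be activated before requiring the
# library with RubyGems' Kernel#gem:
#   gem 'shorturl', '= 0.8.2'
#   require 'shorturl'
```

Activating the pinned version explicitly helps when both 0.8.2 and 0.8.3 are installed side by side.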
== INSTALL:
* sudo gem install twitterize
* Set up the database. Here, for instance, is the schema for MySQL:
    CREATE TABLE `feeds` (
      `id` int(11) NOT NULL auto_increment,
      `name` varchar(255) default NULL,
      `url` text,
      `user` varchar(255) default NULL,
      `password` varchar(255) default NULL,
      `last_check` datetime default NULL,
      `next_check` datetime default NULL,
      `interval` int(11) default NULL,
      PRIMARY KEY (`id`)
    ) DEFAULT CHARSET=utf8;

    CREATE TABLE `items` (
      `id` int(11) NOT NULL auto_increment,
      `feed_id` int(11) default NULL,
      `title` varchar(255) default NULL,
      `guid` varchar(255) default NULL,
      `link` text,
      `twitter` varchar(255) default NULL,
      `published_at` datetime default NULL,
      `posted` tinyint(4) default '0',
      `posted_at` datetime default NULL,
      PRIMARY KEY (`id`)
    ) DEFAULT CHARSET=utf8;
* Create a twitter.yml config file somewhere. It looks like the following:
    database:
      active record settings
    feeds:
      name1:
        url: the url of the feed
        user: the twitter account to post to
        password: the twitter password
        interval: (secs, optional) to force updates, despite ttl (default: 30 mins)
      name2, etc.
* You can add more feeds to twitter.yml at a later time and they will be added to the internal database when twitterize next runs.
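As a concrete illustration of the config format and of how new feeds get picked up, here is a sketch using Ruby's standard YAML library. Every value in +EXAMPLE_CONFIG+ is a placeholder, the database keys are just typical ActiveRecord connection settings, and +feeds_from_config+ and +sync_feeds+ are hypothetical helpers, not twitterize's real internals.

```ruby
require 'yaml'

DEFAULT_INTERVAL = 30 * 60 # 30 minutes, the default stated above

# Hypothetical example config; every value here is a placeholder.
EXAMPLE_CONFIG = <<~YAML
  database:
    adapter: mysql
    database: twitterize
  feeds:
    my_blog:
      url: http://example.com/feed.xml
      user: my_blog_twitter
      password: secret
    my_news:
      url: http://example.com/news.rss
      user: my_news_twitter
      password: secret
      interval: 600
YAML

# Parse the config text and return one hash per feed, filling in the
# default interval where none is given.
def feeds_from_config(yaml_text)
  config = YAML.safe_load(yaml_text)
  config.fetch('feeds').map do |name, settings|
    { 'name' => name, 'interval' => DEFAULT_INTERVAL }.merge(settings)
  end
end

# Add feeds that are in the config but not yet stored. Nothing is
# removed, matching the behaviour described under FEATURES/PROBLEMS.
def sync_feeds(stored, configured)
  known = stored.map { |feed| feed['name'] }
  stored + configured.reject { |feed| known.include?(feed['name']) }
end
```

Feeds are matched by name in this sketch, which also shows why renaming a feed is a bad idea: the renamed entry would look like a brand new feed.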
== LICENSE:
(The MIT License)
Copyright (c) 2007 FIX
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
'Software'), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.