
On Rabbits and Elephants

Data Distribution with pg_amqp and RabbitMQ


Gavin M. Roy @crad - pgConf.eu 2011

myYearbook.com
#1 Comscore Teen Destination, Top Trafficked Sites in US, who is hiring
Use of PostgreSQL predates my joining as CTO in 2007, and we're hiring
Use of RabbitMQ goes back to 2009, we were hiring then too
Awesome place to work, have been known to sponsor Europeans for work in the US, and we're hiring

Don't try this at home.


(try it at work or at least on your laptop)


Audience Participation
You'll need:
PostgreSQL 9.1rc1 with pgbench installed
Python 2.6 or Python 2.7
Tarball from http://192.168.103.135/~gmr/
Untar and then ./setup-client.py


Use Cases
Adding complexity to your environment
Ensuring data inconsistency between canonical data sources and data warehouses
Impressing your hipster Ruby programmer friends
Replace your Slony replication setup

Use Cases
Write distributed map-reduce functions in Erlang


What is PostgreSQL?
(just kidding)


What is RabbitMQ?
(don't let the cloud-based marketing deter you)


About AMQP
Platform- and vendor-neutral messaging format
0.9.1 finalized in November 2008
JP Morgan Chase, Red Hat, Rabbit Technologies, iMatix, IONA Technologies, Cisco Systems, Envoy Technologies, Twist Process Innovations
Members now include Bank of America, Barclays Bank, Microsoft Corporation, Novell, VMware
1.0 Final: October 2011
Changes some stuff

RabbitMQ & AMQP Concepts


Going to only scratch the surface


Management Plugin


Publishers


Message Broker


Consumers


Message Routing
It's not just about Queues.

Routing Key
Helps Determine Message Destination (in most cases)


Queues
Durability: Queue survives reboot
Auto-Delete: Delete the queue when consumer goes away
Exclusive: Only one consumer may consume
x-expires: Auto-Delete after given duration
x-message-ttl: Auto-discard a message after ttl

Declaring a Queue
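A minimal sketch of how the queue options listed on the Queues slide might map onto a declaration call. It assumes a pika-style channel whose `queue_declare` accepts `queue`, `durable`, `exclusive`, `auto_delete` and `arguments`; the helper names `queue_options` and `declare_queue` and the example values are illustrative, not taken from the talk.

```python
def queue_options(durable=True, auto_delete=False, exclusive=False,
                  expires_ms=None, message_ttl_ms=None):
    """Build keyword arguments for a queue_declare call (hypothetical helper)."""
    arguments = {}
    if expires_ms is not None:
        # x-expires: the queue is removed after being unused this long (ms)
        arguments["x-expires"] = expires_ms
    if message_ttl_ms is not None:
        # x-message-ttl: messages are discarded after this long (ms)
        arguments["x-message-ttl"] = message_ttl_ms
    return {
        "durable": durable,          # queue definition survives a broker restart
        "auto_delete": auto_delete,  # queue is deleted when its consumers go away
        "exclusive": exclusive,      # only the declaring connection may consume
        "arguments": arguments,
    }


def declare_queue(channel, name, **options):
    """Declare `name` on a pika-style channel (assumed interface)."""
    return channel.queue_declare(queue=name, **queue_options(**options))
```

With a real connection this might be called as `declare_queue(channel, "pgbench_accounts", message_ttl_ms=30000)`; the queue name here is only an example.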


Exchanges
(The ones we care about for this talk)


Direct exchanges
1:1 Message Delivery Pattern
Routing Keys are direct, no wildcarding
All queues bound to a routing key get the message
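
To make the direct-exchange rules concrete, here is a toy in-memory model. It is not RabbitMQ and not a client library; `DirectExchange` is a hypothetical class and plain Python lists stand in for queues. It only illustrates exact-match routing and that every queue bound to a key gets a copy.

```python
from collections import defaultdict


class DirectExchange:
    """Toy model of direct-exchange routing (illustration only)."""

    def __init__(self):
        # routing key -> list of bound queues (plain lists stand in for queues)
        self.bindings = defaultdict(list)

    def bind(self, queue, routing_key):
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        # Exact match only: '*' and '#' have no special meaning here.
        targets = self.bindings.get(routing_key, [])
        for queue in targets:
            queue.append(message)
        return len(targets)  # number of queues that received the message
```

Publishing with a key that has no exact binding delivers to no queue in this model, which mirrors the "no wildcarding" point above.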


Topic exchanges
Pattern Matching in Routing Keys
Publish one message type to public.pgbench_accounts, another to public.pgbench_history
Period delimiter: * matches exactly one period-delimited word, # matches zero or more words
Example patterns: *.pgbench_accounts, public.*
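
The wildcard rules can be sketched as a small matcher. This is a simplified illustration of AMQP topic matching written for this deck, not the broker's implementation; the function name `topic_matches` is hypothetical.

```python
def topic_matches(pattern, routing_key):
    """Return True if an AMQP-style topic pattern matches a routing key."""

    def match(p, k):
        if not p:
            return not k  # pattern exhausted: match only if key is too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more words
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            # '*' consumes exactly one word; a literal must match exactly
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))
```

For instance, `topic_matches("public.*", "public.pgbench_history")` holds, while `topic_matches("public.*", "public.a.b")` does not, because `*` never crosses a period.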

Other Exchange Types


Built-In: Fanout, Headers
3rd Party or Plugin Types:
  Consistent Hash Exchange: Distributes messages across queues by hashing the routing key
  External: RPC Exchange Plugin
  Riak
  Last-Value Cache
  Recent History Exchange: Keeps last 20 messages routed
  Global Fanout: Sends to all queues, regardless of routing key


One More thing...


Exchange-to-Exchange Bindings


Messages
Have client-specified properties: Content type, Encoding, Timestamp, App-Id, User-Id, Headers
Delivery Mode:
  1: Non-Persistent
  2: Persistent
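
A hedged sketch of assembling these properties in Python. The keys follow the AMQP 0-9-1 basic property names, so the resulting dict could plausibly be passed as `pika.BasicProperties(**props)`, but that pairing is an assumption; `message_properties` and the defaults are illustrative only.

```python
import time

NON_PERSISTENT = 1  # message may be lost if the broker restarts
PERSISTENT = 2      # broker writes the message to disk


def message_properties(app_id, persistent=True, headers=None,
                       content_type="application/json",
                       content_encoding="utf-8"):
    """Build a dict of AMQP basic properties (hypothetical helper)."""
    return {
        "content_type": content_type,
        "content_encoding": content_encoding,
        "timestamp": int(time.time()),  # seconds since the epoch
        "app_id": app_id,
        "headers": dict(headers or {}),
        "delivery_mode": PERSISTENT if persistent else NON_PERSISTENT,
        # user_id is left out on purpose: it has to match the name of the
        # connected user, so it is better set where the connection is known.
    }
```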

Step-By-Step Instructions
1. Install Erlang on the machine you intend to run your RabbitMQ broker on. My preference is for Erlang R14b2, which almost makes me think they name Erlang releases after Star Wars characters, but they don't. Erlang is not as fun as Debian is. It doesn't need to be a beefy machine, but it should have enough RAM that if you don't have an active consumer it won't run out of memory.

2. Install RabbitMQ. This is a multistep process involving black magic. Depending on your distribution you can either download a package, or if you're *nix impaired there is a Windows installer too. Oh, and don't forget to install the management plugin. You can download the management plugin from the RabbitMQ site pretty easily. After you have downloaded it, you put it in the RabbitMQ plugins directory. Kind of makes sense, doesn't it?

3. Go to the store and buy RabbitMQ in Action and RabbitMQ in Detail. What do you mean you can't? No pre-order either? Well, there are plenty of good blogs on the intarwebs that have good information, such as *cough* mine. Seriously though, if you are crazy enough to follow my advice and get this far out of actual interest in the subject, make sure you verse yourself in the operation of, and the flexibility that is, RabbitMQ.

4. Install one of the following: pl/Python, pl/Perl, pl/Ruby, pl/PHP, pl/PGSQL and pg_amqp, or write your own damn add-on in C using librabbitmq and then distribute it using the very cool pgxn service. You only need this on the databases you're going to be writing triggers on. Are you still reading this? Seriously? I would have thought I had lost you by now. Either that or you are skipping ahead because I should have changed the slide by now. It could be that Selena is operating the slides, in which case I have no control over these slides, which is a terrifying thought. You see, there is very little content on the slides, a real lack of substance. If I stay on any one slide for too long then you'll realize I really don't know what I am talking about, and then I'd not be asked back to give one of the most important talks of the conference.

5. It's almost time to write your publisher, but don't get too far ahead of yourself. You first need to make sure that you have figured out a routing paradigm to use. Don't you like that word? Paradigm. I hear it in my head as para dig'um. Anyway, back to routing paradigms. Depending on your use case you will have the ability to broadcast your events to a whole slew of databases that care about what you have to send, or just one. You can implement a 1-to-1 type of scenario where a "direct" exchange is set up and your data will be queued in a way that no message is duplicated to your client and there are transactional guarantees in RabbitMQ. You probably don't want that though, as you're a fly-by-the-seat-of-your-pants type of person, aren't you? For that you'll want to turn on noAck mode in your consumers. Rabbit will just fling your messages to your consumers as fast as possible, just like an ape at a zoo and his favorite projectile.

6. This is where things get fun. We are going to write a stored procedure in the language of your choice and have it publish to RabbitMQ using an exchange and a routing key. Now, at this point, if you are still reading this, I have to question your sanity. This was a one-off gag slide to make people at the conference think that I was going to read all of this. Anyway, write your function that publishes and you are almost done. No, seriously. Well, not quite done, because you still need the part that calls this function and then the consumer app that reads the data from RabbitMQ and then does the actions in PostgreSQL that you want it to do.

7. Now, basically you're going to add after insert, update, delete triggers to the tables you care about. I'm not going to go into detail about how to do that, since you are at a PostgreSQL conference and should know how to do that already. Instead I am going to assume you know what I am writing about and say that in these triggers you will want to call the function we wrote in step 6.

8. Go to http://github.com/gmr/On-Rabbits-and-Elephants/ and download the sample code and use that instead of all of this nonsense.


pg_amqp
Is a publisher-only PostgreSQL extension
Transactional if you use amqp.publish
Not if you use amqp.autonomous_publish
Needs some <3 for more advanced features like delivery mode changing and message properties.
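
A small sketch of choosing between the two functions from Python. It assumes both take `(broker_id, exchange, routing_key, message)` in that order, as I understand the pg_amqp documentation to describe; check this against the installed version. `publish_statement` is a hypothetical helper, and the `%s` placeholders assume a psycopg2-style cursor.

```python
def publish_statement(broker_id, exchange, routing_key, message,
                      transactional=True):
    """Return (sql, params) for a pg_amqp publish call (assumed signature)."""
    # amqp.publish takes part in the surrounding transaction;
    # amqp.autonomous_publish does not.
    function = "amqp.publish" if transactional else "amqp.autonomous_publish"
    sql = "SELECT {}(%s, %s, %s, %s)".format(function)
    return sql, (broker_id, exchange, routing_key, message)
```

With a database cursor this might be used as `cursor.execute(*publish_statement(1, "amq.topic", "public.pgbench_accounts", payload))`, where the broker id `1` and the exchange name are example values.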

Use a Trigger
(benefit from the value of a trigger?)


Not that kind of trigger.



Maybe this kind.



Really this kind.


(pretty boring, I know)

Stop. Demo Time.



Demo
Using pgbench
PostgreSQL 9.1rc1
pg_amqp and plpythonu
RabbitMQ 2.6.1
Python 2.7 (Should work in 2.5 or 2.6 too)

Generic Trigger
Publishes to schema.table_name
Wraps the entire trigger data set into a JSON payload
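
A rough sketch of the core of such a trigger, written as plain Python so it can run outside the database. It assumes the trigger data arrives as a dict shaped like PL/Python's `TD` (keys such as `event`, `table_schema`, `table_name`, `new`, `old`); `trigger_message` is a hypothetical name, and the real trigger in the sample repository may be structured differently.

```python
import json


def trigger_message(td):
    """Turn a TD-like trigger dict into (routing_key, json_payload)."""
    # Routing key is schema.table_name, e.g. public.pgbench_accounts
    routing_key = "{}.{}".format(td["table_schema"], td["table_name"])
    # default=str keeps non-JSON types (timestamps, decimals) from failing
    payload = json.dumps(td, default=str, sort_keys=True)
    return routing_key, payload
```

Inside a plpythonu trigger function, the returned pair would then be handed to the publish call.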


consumer.py
Dynamically builds SQL based upon event type and executes on a remote PostgreSQL database

./consumer.py 192.168.103.135
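
As an illustration of building SQL from the event type, here is a minimal sketch. It assumes the message is JSON with `event`, `table_schema`, `table_name`, `new` and `old` keys, and that the caller supplies the key columns used to locate a row; `build_sql`, `quote_ident` and `key_columns` are hypothetical names, and the actual consumer.py may work differently. Placeholders follow the psycopg2 `%s` style.

```python
import json


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_sql(message, key_columns):
    """Return (sql, params) that replays one trigger event remotely."""
    data = json.loads(message)
    table = "{}.{}".format(quote_ident(data["table_schema"]),
                           quote_ident(data["table_name"]))
    event = data["event"]
    if event == "INSERT":
        cols = sorted(data["new"])
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            table,
            ", ".join(quote_ident(c) for c in cols),
            ", ".join(["%s"] * len(cols)))
        return sql, [data["new"][c] for c in cols]
    # UPDATE and DELETE locate the row by its old key values
    where = " AND ".join("{} = %s".format(quote_ident(c)) for c in key_columns)
    keys = [data["old"][c] for c in key_columns]
    if event == "UPDATE":
        cols = sorted(data["new"])
        assignments = ", ".join("{} = %s".format(quote_ident(c)) for c in cols)
        sql = "UPDATE {} SET {} WHERE {}".format(table, assignments, where)
        return sql, [data["new"][c] for c in cols] + keys
    if event == "DELETE":
        return "DELETE FROM {} WHERE {}".format(table, where), keys
    raise ValueError("unsupported event: {!r}".format(event))
```

Values always travel as parameters and identifiers are quoted, so nothing from the message is spliced into the statement unescaped.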


Proof!
./watch-messages.py 192.168.103.135
./watch-db-client.py


Interactive Demo

Resources
rabbitmq:
http://www.rabbitmq.com

pg_amqp:
easy_install pgxnclient
USE_PGXS=1 pgxn install pg_amqp

Please give feedback: http://2011.pgconf.eu/feedback

http://pgxn.org/dist/pg_amqp/

example code:
https://github.com/gmr/On-Rabbits-and-Elephants (pgConf.eu branch)

follow me on twitter: @crad

