Sunday, 17 March, 13
what is DISQUS?
why do realtime?
- getting new data to the user asap
- for increased engagement
- and it looks awesome
- and we can sell (or trade) it
http://map.labs.disqus.com
http://github.com/NorthIsUp/orbital2
realertime
- currently active on all DISQUS sites
- tested dark on our existing network
- during testing:
  - 1.5 million concurrently connected users
  - 45 thousand new connections per second
  - 165 thousand messages/second
  - <.2 seconds latency end to end
- thoonk redis queue
- some python glue
- nginx push stream
- and long(er) polling
architecture overview
old-june
DISQUS
memcache New Posts memcache
june-july
DISQUS
redis pub/sub Flask FE cluster New Posts redis pub/sub
HA Proxy
july-october
DISQUS
redis queue Flask FE cluster New Posts redis pub/sub
HA Proxy
august-october
DISQUS
redis queue
python glue, Gevent server
redis pub/sub
Flask FE cluster
New Posts
server counts shown on the slide: 6, 2, 14 (BIG), 5
october-now
DISQUS
redis queue
python glue, Gevent server (2)
nginx pub endpoint
nginx + push stream module (5)
New Posts
Why still 5 for this? Network memory restriction: we can't fix this without kernel hacking, tweaking, etc. (If you know how, tell us, then apply for a job, then fix it for us.)
october-now
django
thoonk queue
Formatter
http post
nginx pub endpoint
nginx + push stream module
New Posts
- django post_save and post_delete hooks
- thoonk is a queue on top of redis
- implemented as a DFA
- provides job semantics
  - useful for end to end acking
  - reliable job processing in distributed system
- uses zset to store items == ranged queries
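The job semantics above can be sketched in plain Python. This is not Thoonk's API: `JobQueue`, its method names, the state names and the `claim_timeout` parameter are all hypothetical. A sorted list stands in for the redis zset, and each job moves through a small state machine (available, claimed, finished), with unacked claims going back to available.

```python
import bisect
import itertools


class JobQueue(object):
    """In-memory stand-in for a redis-backed job queue (hypothetical API)."""

    # Allowed state transitions: the DFA that gives the queue job semantics.
    TRANSITIONS = {
        ('available', 'claim'): 'claimed',
        ('claimed', 'ack'): 'finished',
        ('claimed', 'timeout'): 'available',
    }

    def __init__(self, claim_timeout=30.0):
        self.claim_timeout = claim_timeout
        self._ids = itertools.count(1)
        self._available = []   # sorted (score, job_id) pairs, like a zset
        self._claimed = {}     # job_id -> (claim deadline, original score)
        self._state = {}       # job_id -> current state
        self._payload = {}     # job_id -> payload

    def _move(self, job_id, event):
        key = (self._state[job_id], event)
        if key not in self.TRANSITIONS:
            raise ValueError('illegal transition %r for job %r' % (key, job_id))
        self._state[job_id] = self.TRANSITIONS[key]

    def put(self, payload, now):
        job_id = next(self._ids)
        self._state[job_id] = 'available'
        self._payload[job_id] = payload
        bisect.insort(self._available, (now, job_id))
        return job_id

    def claim(self, now):
        """Hand out the oldest available job, or None if there is none."""
        self.requeue_stalled(now)
        if not self._available:
            return None
        score, job_id = self._available.pop(0)
        self._move(job_id, 'claim')
        self._claimed[job_id] = (now + self.claim_timeout, score)
        return job_id, self._payload[job_id]

    def ack(self, job_id):
        """End-to-end ack: the worker confirms the job is fully processed."""
        self._move(job_id, 'ack')
        del self._claimed[job_id]
        del self._payload[job_id]

    def requeue_stalled(self, now):
        """Put jobs whose claim expired without an ack back on the queue."""
        for job_id, (deadline, score) in list(self._claimed.items()):
            if deadline <= now:
                self._move(job_id, 'timeout')
                del self._claimed[job_id]
                bisect.insort(self._available, (score, job_id))

    def published_between(self, start, end):
        """Ranged query over scores, the kind of lookup a zset makes cheap."""
        lo = bisect.bisect_left(self._available, (start, 0))
        hi = bisect.bisect_right(self._available, (end, float('inf')))
        return [job_id for _, job_id in self._available[lo:hi]]
```

The cost of acking shows up in the bookkeeping: every job is tracked until a worker confirms it, and a worker that dies mid-job only delays delivery by one timeout.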
Formatter
- this is the final format for end clients
- compress data now
Publishers
gevent is nice
# the code is too big to show here, so just import it
# http://bitly.com/geventspawn
from realertime.lib.spawn import Watchdog
from realertime.lib.spawn import TimeSensitiveBackoff
data pipelines
class Pipeline(object):
    # Each stage is supplied by a mixin; the base class only wires them together.
    def parse_data(self, data):
        raise NotImplementedError('No ParserMixin used')

    def compute_data(self, data, parsed_data):
        raise NotImplementedError('No ComputeMixin used')

    def publish_data(self, data, parsed_data, computed_data):
        raise NotImplementedError('No PublisherMixin used')

    def handle(self, data):
        parsed_data = self.parse_data(data)
        computed_data = self.compute_data(data, parsed_data)
        return self.publish_data(data, parsed_data, computed_data)
Example Mixins
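import codecs
import json
import urllib.request

class JSONParserMixin(Pipeline):
    def parse_data(self, data):
        return json.loads(data)

class AnnomizeDataMixin(Pipeline):
    # compute stage: drop everything
    def compute_data(self, data, parsed_data):
        return {}

class SuperSecureEncryptDataMixin(Pipeline):
    # compute stage: "encrypt" the serialized payload with rot13
    def compute_data(self, data, parsed_data):
        return codecs.encode(json.dumps(parsed_data), 'rot_13')

class HTTPPublisherMixin(Pipeline):
    # publish stage: POST the computed data to self.dat_url
    def publish_data(self, data, parsed_data, computed_data):
        body = json.dumps(computed_data).encode('utf-8')
        return urllib.request.urlopen(self.dat_url, body)

class FilePublisherMixin(Pipeline):
    # publish stage: append the computed data to self.output, one JSON value per line
    def publish_data(self, data, parsed_data, computed_data):
        with open(self.output, 'a') as f:
            f.write(json.dumps(computed_data) + '\n')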
Finished Pipeline
class JSONAnnonHTTPPipeline(
        JSONParserMixin, AnnomizeDataMixin, HTTPPublisherMixin):
    pass

class JSONSecureHTTPPipeline(
        JSONParserMixin, SuperSecureEncryptDataMixin, HTTPPublisherMixin):
    pass

class JSONAnnonFilePipeline(
        JSONParserMixin, AnnomizeDataMixin, FilePublisherMixin):
    pass
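A minimal runnable sketch of how the composition resolves, assuming Python 3. `KeepKeysMixin`, `ListPublisherMixin` and the two pipeline classes are hypothetical stand-ins that avoid network and file I/O; `Pipeline` and `JSONParserMixin` follow the slides. Each stage method is found on the first mixin in the method resolution order that defines it, and a missing stage falls through to the base class and raises.

```python
import json


class Pipeline(object):
    def parse_data(self, data):
        raise NotImplementedError('No ParserMixin used')

    def compute_data(self, data, parsed_data):
        raise NotImplementedError('No ComputeMixin used')

    def publish_data(self, data, parsed_data, computed_data):
        raise NotImplementedError('No PublisherMixin used')

    def handle(self, data):
        parsed_data = self.parse_data(data)
        computed_data = self.compute_data(data, parsed_data)
        return self.publish_data(data, parsed_data, computed_data)


class JSONParserMixin(Pipeline):
    def parse_data(self, data):
        return json.loads(data)


class KeepKeysMixin(Pipeline):
    # Hypothetical compute stage: keep only whitelisted keys.
    keep = ('thread', 'body')

    def compute_data(self, data, parsed_data):
        return dict((k, v) for k, v in parsed_data.items() if k in self.keep)


class ListPublisherMixin(Pipeline):
    # Hypothetical publish stage: collect output in memory instead of doing I/O.
    def __init__(self):
        self.published = []

    def publish_data(self, data, parsed_data, computed_data):
        self.published.append(computed_data)
        return computed_data


class JSONKeepListPipeline(JSONParserMixin, KeepKeysMixin, ListPublisherMixin):
    pass


class ParseOnlyPipeline(JSONParserMixin):
    # No compute or publish mixin: handle() raises NotImplementedError.
    pass


pipeline = JSONKeepListPipeline()
result = pipeline.handle('{"thread": 907824578, "body": "hi", "ip": "10.0.0.1"}')
mro_names = [cls.__name__ for cls in JSONKeepListPipeline.__mro__]
```

Because every mixin inherits from `Pipeline`, the base class lands last in the MRO (just before `object`), so a mixin's stage always wins over the base class's raising default.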
follow John Watson (@wizputer) for updated #humblebrags as we ramp up traffic
an example config can be found here: http://bit.ly/disqus-nginx-push-stream
http://wiki.nginx.org/HttpPushStreamModule
- Replaced webservers and Redis Pub/Sub
- But starting with Pub/Sub was important for us
- ~950K subscribers (peak single machine)
- peak 40 MBytes/second (per machine)
- CPU usage is still well under 15%
- 99.845% active writes (the socket is written to often enough to come up as ACTIVE)
examples
# Subs
curl -s 'localhost/sub/forum/cnn'
curl -s 'localhost/sub/thread/907824578'
curl -s 'localhost/sub/user/northisup'

# Pubs
curl -s -X POST 'localhost/pub?channel=forum:cnn' \
    -d '{"some sort": "of json data"}'
curl -s -X POST 'localhost/pub?channel=thread:907824578' \
    -d '{"more": "json data"}'
curl -s -X POST 'localhost/pub?channel=user:northisup' \
    -d '{"the idea": "I think you get it by now"}'
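The same publish call can be sketched from Python, which is roughly what an "http post" glue step has to do. This is a sketch using only the standard library; the `/pub?channel=kind:id` shape is taken from the curl examples, while `BASE_URL`, `build_pub_request` and `publish` are hypothetical names. The colon is deliberately left unescaped so the channel id matches the curl form; that choice is an assumption about how the endpoint reads the query string.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = 'http://localhost'  # assumption: same host as the curl examples


def build_pub_request(kind, ident, payload, base_url=BASE_URL):
    """Build (but do not send) a POST to the pub endpoint for one channel."""
    channel = '%s:%s' % (kind, ident)
    # Keep ':' literal so the channel id is identical to the curl examples.
    query = urllib.parse.urlencode({'channel': channel}, safe=':')
    body = json.dumps(payload).encode('utf-8')
    return urllib.request.Request(
        '%s/pub?%s' % (base_url, query), data=body, method='POST')


def publish(kind, ident, payload, base_url=BASE_URL, timeout=2.0):
    """Send the message; needs an nginx configured like the examples above."""
    request = build_pub_request(kind, ident, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


request = build_pub_request('forum', 'cnn', {'some sort': 'of json data'})
```

Splitting request construction from sending keeps the channel naming testable without a running nginx.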
measure nginx
location = /push-stream-status {
    allow 127.0.0.1;
    deny all;
    push_stream_channels_statistics;
    set $push_stream_channel_id $arg_channel;
}
long(er) polling
onProgress: function () {
    var self = this;
    var resp = self.xhr.responseText;
    var advance = 0;
    var rows;

    // If server didn't push anything new, do nothing.
    if (!resp || self.len === resp.length) return;

    // Server returns JSON objects, one per line.
    rows = resp.slice(self.len).split('\n');
    // The last element is empty or an incomplete line; leave it for the next call.
    rows.pop();

    _.each(rows, function (row) {
        advance += (row.length + 1);
        if (row) self.trigger('progress', JSON.parse(row));
    });

    self.len += advance;
}
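The offset bookkeeping is the whole trick: the response body only grows, so the client remembers how far it has read and parses only newly completed lines. A minimal Python sketch of that idea follows; `LineStreamParser` and `feed` are hypothetical names, not part of any DISQUS client.

```python
import json


class LineStreamParser(object):
    """Tracks how much of a growing response body has been consumed."""

    def __init__(self):
        self.offset = 0

    def feed(self, response_text):
        """Return objects from newly completed lines of the full body so far."""
        new_text = response_text[self.offset:]
        last_newline = new_text.rfind('\n')
        if last_newline == -1:
            return []   # nothing new, or only a partial line so far
        complete = new_text[:last_newline]
        self.offset += last_newline + 1
        # Skip blank lines; every other line is one JSON object.
        return [json.loads(row) for row in complete.split('\n') if row]
```

Calling `feed` with the full body on every progress event mirrors how `responseText` behaves: the caller never has to slice the body itself.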
Soon... EventSource
test
Darktime
- use existing network to load test
- (user complaints when it didn't work...)
Darkesttime
- load testing a single thread
measure
- measure all the things! especially when the numbers don't line up
- measuring is hard in distributed systems
- try to express things as +1 and -1 if you can
- Sentry for measuring exceptions
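One way to read the "+1 and -1" advice, sketched in Python: record only deltas (a connect is +1, a disconnect is -1), so per-process numbers can be summed in any order and still line up. `DeltaCounters` and `merge` are hypothetical names and not the API of Scales or Sentry.

```python
import collections
import threading


class DeltaCounters(object):
    """Named counters that only ever receive +1 / -1 style deltas."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = collections.Counter()

    def incr(self, name, delta=1):
        with self._lock:
            self._values[name] += delta

    def decr(self, name, delta=1):
        self.incr(name, -delta)

    def snapshot(self):
        with self._lock:
            return dict(self._values)


def merge(snapshots):
    """Sum per-process snapshots; deltas add up regardless of arrival order."""
    total = collections.Counter()
    for snap in snapshots:
        total.update(snap)
    return dict(total)
```

Absolute values reported from many machines are hard to reconcile; sums of deltas are not, which is why the same counter can be compared at each hop of the pipeline.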
pretty graphs
POPE
white smoke
Francis announced
maths
wha?
- People do weird stuff with your stuff
- turned off this server in Oct 2012
- Still getting 100 req/sec
lessons
- do hard (computation) work early
- end-to-end acks are good, but expensive
- redis/nginx pubsub is effectively free
special thanks
- the team at DISQUS, like jeff a.k.a. @nuxx who had to review all my code
- and especially our dev-ops guys, like john watson a.k.a. @wizputer who found the nginx-push-stream module
psst, we're hiring: disqus.com/jobs
- Nginx push stream module: http://wiki.nginx.org/HttpPushStreamModule
- Thoonk (redis queue): http://github.com/andyet/thoonk.py
- Sentry (distributed traceback aggregation): http://github.com/dcramer/sentry
- Gevent (python coroutines and greenlets): http://gevent.org/
- Scales (in-app metrics): http://github.com/Greplin/scales
- code.disqus.com
Come find me here!
Questions I have
- What is the best kernel config for webscale concurrency? Nginx?
- I <3 gevent, but what if I want to pypy?
- Nginx + lua? Seems kind of awesome.
- Composing data pipelines: good or bad?
I didn't have time to mention:
DISQUSsion?
Adam Hitchcock @NorthIsUp