<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-3915486011379503126</id><updated>2011-10-12T23:10:03.937-07:00</updated><category term='Ad Server'/><category term='Browse'/><category term='App'/><category term='User Targeting'/><category term='Future'/><category term='RTB'/><category term='Big Data'/><category term='Highly Scaleable Systems'/><category term='Digital Ad Ecosystem'/><title type='text'>AlokS Blog</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>11</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-6181180185783505595</id><published>2011-10-12T23:10:00.000-07:00</published><updated>2011-10-12T23:10:04.242-07:00</updated><title type='text'>Loading Lists into SQL Server</title><content type='html'>We recently ran into a 
performance wall when loading objects of varying sizes (from 1 to, say, a million properties) into SQL Server 2008.&lt;br /&gt;&lt;br /&gt;As you will find, an ADO.NET DataTable passed as a Table-Valued Parameter (TVP, in SQL land) is generally described as the fastest way to load lists and arrays into SQL Server.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.codeproject.com/KB/database/SQL-2008-TableValue-Param.aspx"&gt;Table Value Parameter in SQL Server 2008&lt;/a&gt; and &lt;a href="http://www.sommarskog.se/arrays-in-sql-2008.html#Revisions"&gt;Handling Arrays and List in SQL&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;OK, this was not fast enough for our needs. We were comparing our object write performance with writing a correspondingly long string, and we kept coming up short. For example, if you load an array of strings, say each 1k long, via a DataTable and then do the same with one simple SProc that takes a single parameter (one string 10k long), the single insert is (predictably) the fastest.&lt;br /&gt;&lt;br /&gt;But we would never know a priori the size of the array/list, so we could not create a static SProc that takes the correct number of parameters.&lt;br /&gt;&lt;br /&gt;So, we tried breaking up large arrays into chunks of, say, 5 parameters each and calling a single SProc that took 5 parameters. Of course, we had N threads doing this (each thread calling the SProc). Intuitively, we thought this would yield better throughput. But it did not! The DataTable (aka TVP) approach still delivered better final throughput, while still not meeting our write throughput expectations.&lt;br /&gt;&lt;br /&gt;So, we profiled the actual SProc being called. Before we mention what we found, and to preempt the SQL experts' questions: we took almost everything out of the SProc except the insert of N parameters (columns) into 3 tables.&lt;br /&gt;&lt;br /&gt;We discovered that ADO.NET basically creates dynamic SQL like "insert into @p1 values.." before calling our SProc. 
So, even the fastest way of loading lists and arrays into SQL Server involves some dynamic SQL that needs to be compiled each time!&lt;br /&gt;&lt;br /&gt;We asked the SQL Server team and they confirmed what we found. They did, however, suggest using&amp;nbsp;&lt;a href="http://msdn.microsoft.com/en-us/library/microsoft.sqlserver.server.sqldatarecord.aspx"&gt;SqlDataRecord&lt;/a&gt;, and we found it did yield a 5-10% improvement (but we were looking for a 100% improvement). So, we will wait for the next release of SQL Server, which promises better throughput.&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3915486011379503126-6181180185783505595?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/6181180185783505595/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=6181180185783505595' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/6181180185783505595'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/6181180185783505595'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2011/10/loading-lists-into-sql-server.html' title='Loading Lists into SQL Server'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-4330206439148896161</id><published>2011-09-09T13:55:00.000-07:00</published><updated>2011-09-09T13:55:54.066-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='User Targeting'/><title type='text'>Merging Facial Recognition with Data About the User</title><content type='html'>&lt;div&gt;&lt;div&gt;One has to, of course, worry about loss of privacy and valid legal challenges.&lt;br /&gt;&lt;br /&gt;However, on a purely technical and business level, this is quite a convergence in targeting: being able to recognize a user (visually, in this case) and attach that identity to the data about the user (why stop at the ads they clicked on... go on to their consumer profile and find out what type of car they can afford).&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Alessandro Acquisti, Ph.D, a researcher and instructor still at Carnegie Mellon has designed an iPhone app that functions as a front end for PittPatt's facial recognition technology. As mentioned, it can identify strangers Facebook profiles with startling accuracy.&lt;br /&gt;&lt;br /&gt;And that's not all it can do. It also incorporates searches of public databases that allows it to make a good guess at your social security number. If it knows your date of birth (e.g. if your Facebook profile is public), there's a good chance it can ID your social security number. &lt;a href="http://www.dailytech.com/New+App+Can+ID+Complete+Strangers+Facebook+and+Social+Security+No/article22677.htm"&gt;More&lt;/a&gt;&lt;/blockquote&gt;The real question is how to make this technology available to users in a way that makes their lives a bit easier without sacrificing privacy (remember, a decade ago you did not have a cell phone, so your wife or your boss could not reach you when you travelled. 
Cell phones have made our lives easier, but we are now available all the time.)&lt;/div&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3915486011379503126-4330206439148896161?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/4330206439148896161/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=4330206439148896161' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/4330206439148896161'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/4330206439148896161'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2011/09/merging-facial-recognisation-with-data.html' title='Merging Facial Recognition with Data About the User'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-7901643730458561477</id><published>2011-08-12T10:39:00.000-07:00</published><updated>2011-08-12T10:45:05.683-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Big Data'/><title type='text'>Big Data - Visualizing the Exploding Data Growth</title><content type='html'>&lt;div&gt;I found this blog post really illuminating. 
The actual data volume and growth curve may be off, but at least it captures in one place the sheer magnitude of the data.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://blog.getsatisfaction.com/2011/07/13/big-data/?view=socialstudies"&gt;http://blog.getsatisfaction.com/2011/07/13/big-data/?view=socialstudies&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3915486011379503126-7901643730458561477?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/7901643730458561477/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=7901643730458561477' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/7901643730458561477'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/7901643730458561477'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2011/08/big-data-visualizing-exploding-data.html' title='Big Data - Visualizing the Exploding Data Growth'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-7176045561001019873</id><published>2010-08-18T13:21:00.001-07:00</published><updated>2010-08-18T13:25:29.340-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='App'/><category scheme='http://www.blogger.com/atom/ns#' term='Future'/><category scheme='http://www.blogger.com/atom/ns#' 
term='Browse'/><title type='text'>Web Browsing is coming to An End</title><content type='html'>Apps are the way to go, from apps on iPhone, iPad, Facebook, NetTV... to the Windows desktop (I know, old is new again).&lt;br /&gt;&lt;br /&gt;Great read: &lt;a href="http://www.wired.com/magazine/2010/08/ff_webrip/"&gt;http://www.wired.com/magazine/2010/08/ff_webrip/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3915486011379503126-7176045561001019873?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/7176045561001019873/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=7176045561001019873' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/7176045561001019873'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/7176045561001019873'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2010/08/web-browsing-is-coming-to-end.html' title='Web Browsing is coming to An End'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-3505208425325214130</id><published>2010-05-27T13:32:00.000-07:00</published><updated>2010-05-27T13:42:31.909-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='RTB'/><title type='text'>Good primer on RTB</title><content type='html'>We are moving to an RTB platform for 
serving ads. More details will come in the near future.&lt;br /&gt;&lt;br /&gt;I found the following posts to be a really good read as RTB 101:&lt;br /&gt;&lt;br /&gt;Part 1: &lt;a href="http://www.mikeonads.com/2009/08/30/rtb-part-i-what-is-it/"&gt;http://www.mikeonads.com/2009/08/30/rtb-part-i-what-is-it/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Part 2: &lt;a href="http://www.mikeonads.com/2009/09/19/rtb-part-ii-supply-supply-supply/"&gt;http://www.mikeonads.com/2009/09/19/rtb-part-ii-supply-supply-supply/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Part 3: &lt;a href="http://www.mikeonads.com/2010/02/22/rtb-part-iii-cookies-user-data/"&gt;http://www.mikeonads.com/2010/02/22/rtb-part-iii-cookies-user-data/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3915486011379503126-3505208425325214130?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/3505208425325214130/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=3505208425325214130' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/3505208425325214130'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/3505208425325214130'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2010/05/good-primer-on-rtb.html' title='Good primer on RTB'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' 
height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-715254350373394845</id><published>2010-05-21T11:47:00.000-07:00</published><updated>2010-05-21T11:55:12.841-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Digital Ad Ecosystem'/><title type='text'>Digital Ad Market</title><content type='html'>&lt;a href="http://4.bp.blogspot.com/_M9oj-P7EuHk/S_bW2bLm0uI/AAAAAAAAAW4/SXW-GwvTdBQ/s1600/Digital+Ad+EcoSystem.jpg"&gt;&lt;img style="MARGIN: 0px 0px 10px 10px; WIDTH: 320px; FLOAT: right; HEIGHT: 200px; CURSOR: hand" id="BLOGGER_PHOTO_ID_5473798627613135586" border="0" alt="" src="http://4.bp.blogspot.com/_M9oj-P7EuHk/S_bW2bLm0uI/AAAAAAAAAW4/SXW-GwvTdBQ/s320/Digital+Ad+EcoSystem.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;The following slideware/talk is a must read for understanding the current ecosystem and for understanding the players. &lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;a href="http://www.slideshare.net/tkawaja/terence-kawajas-iab-networks-and-exchanges-keynote"&gt;http://www.slideshare.net/tkawaja/terence-kawajas-iab-networks-and-exchanges-keynote&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3915486011379503126-715254350373394845?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/715254350373394845/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=715254350373394845' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/715254350373394845'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/715254350373394845'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2010/05/digital-ad-market.html' title='Digital Ad Market'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_M9oj-P7EuHk/S_bW2bLm0uI/AAAAAAAAAW4/SXW-GwvTdBQ/s72-c/Digital+Ad+EcoSystem.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-7327526850539516215</id><published>2010-05-06T11:41:00.000-07:00</published><updated>2010-05-06T12:03:30.660-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Highly Scaleable Systems'/><category scheme='http://www.blogger.com/atom/ns#' term='Ad Server'/><title type='text'>Building out a Scaleable Ad Platform</title><content type='html'>&lt;strong&gt;Building a Scaleable Ad Network&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;In this post, I will provide the details of our almost year-long journey building out a scaleable Ad Network that can handle 10 million, 100 million, and yes, in the near future, 1 billion ads a day.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Phase 1: Building a CPA Network&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Almost a year ago, we were mostly focused on a tool, Publisher Gallery, which publishers could use to find relevant creatives/campaigns and drag and drop them into their playlist (usually matching a placement on their web site). At that time, as a business, we were mostly focused on being a performance-based (CPA) Ad Network. 
At that time, &lt;a href="http://www.openx.org/"&gt;OpenX&lt;/a&gt; (Community Download) fit the bill perfectly, as it was open source, had a rich set of APIs, and, most importantly, we found a way (XML-RPC) to talk to its API. We had to modify a few things in it, but it worked as advertised. We built a simple AdOps tool to allow uploading IOs/creatives into the OpenX system, while the functionality-rich Publisher Gallery allowed a specific creative to be added to a specific placement (Gallery indirectly called the OpenX API).&lt;br /&gt;&lt;br /&gt;Once we found the solution to serving ads (leveraging OpenX), we started building the plumbing to make the network scaleable. The first step was to use Amazon Web Services (AWS) to spin up as many OpenX web servers as necessary. The next step was to create replicated MySQL servers so that there was no single point of failure. This tested well for up to 10m ads a day.&lt;br /&gt;&lt;br /&gt;One hard problem we had to solve at this stage was creating and managing AWS EC2 instances. After finding the &lt;a href="http://aws.amazon.com/console/"&gt;Amazon Management Console&lt;/a&gt; too primitive, we quickly moved to &lt;a href="https://s3.amazonaws.com/ec2-downloads/elasticfox.xpi"&gt;ElasticFox&lt;/a&gt; as a better way to manage the instances (BTW: long term, we had to build our own version, taking away some functionality to avoid mistakes, e.g. deleting an instance).&lt;br /&gt;&lt;br /&gt;However, updating EC2 instances with the latest software packages was a huge chore and used to take us days. So, after a bit of research, we opted to use &lt;a href="http://www.opscode.com/"&gt;Chef&lt;/a&gt; as a way to manage installation and configuration of many of the EC2 instances. Our thanks to the Chef team for helping us in the beginning. We still use this tool and can instantiate a couple of dozen EC2 instances in less than 30 minutes. 
But using the tool to manage your ever-evolving software releases requires a dedicated engineer, so it is an expensive way to manage AWS EC2 instances.&lt;br /&gt;&lt;br /&gt;The next challenge was to gather all impression/revenue data from OpenX, make it available in Publisher Gallery, and also surface it to AdOps/FinOps. For this purpose, we built a map/reduce process to periodically get data from OpenX and put summary tables into our databases (while the core code is not that complicated, we found it to be operationally challenging due to occasional network failures, the need to rerun the process to restore data, and failures in the databases themselves). The summary data was then made available in the Publisher Gallery. For Ad Operations, we exposed the data, and many custom reports, via &lt;a href="http://www.jaspersoft.com/jaspersoft-business-intelligence-suite"&gt;Jasper&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The final piece of the puzzle was adding more fault tolerance. So, we added a couple of &lt;a href="http://haproxy.1wt.eu/"&gt;HAProxy&lt;/a&gt; load balancers in front of the OpenX web servers. We also added a heartbeat monitor to detect a non-functioning load balancer and make the working load balancer the active one. Next, we added a second MySQL instance as a replica of the OpenX database and set up continuous replication (yes, we were still on OpenX 2.6, which had a single master database). We experimented with several solutions (MySQL Proxy, HAProxy, AWS EIP), but no one solution provided seamless and automated failover. So, we decided to stay with manual failover to the replica database should the primary database fail (see below about OpenX Distributed Statistics).&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Phase 2: Change over to CPM Network&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Right about when we were about to release a feature-rich Gallery (v2) and had the CPA Ad Network functional, our core business shifted from being CPA-focused to being CPM-focused. 
While functionally our entire end-to-end system worked with minimal changes, all hell broke loose once the load started arriving!&lt;br /&gt;&lt;br /&gt;Our first weakness showed in how we had tested OpenX. Real-world (ad-request) load patterns were different from what we had assumed during testing. So, we quickly added some more OpenX web servers to handle the incoming load.&lt;br /&gt;&lt;br /&gt;Our next set of challenges was caused by our lack of experience with MySQL administration and maintenance. First, we ran out of disk volume space, which we solved by putting the MySQL tables (and later the binlog) on 100GB (and later terabyte-size) Elastic Block Store (EBS) volumes. We also had to create an archive-and-delete process to reduce the raw impression table size so that read and write speeds were reasonable. Finally, we enabled a full backup and restore system so that, despite failures, we can now always restore data.&lt;br /&gt;&lt;br /&gt;We also learned that for rapid transactional writes of impression data, the default MyISAM storage engine was inadequate (table-level locking). So, we migrated to the InnoDB storage engine (row-level locking) and then ran into row-level contention at scale. This prompted us to use &lt;a href="http://www.mysqlperformanceblog.com/2010/01/13/innodb-innodb-plugin-vs-xtradb-on-fast-storage/"&gt;Percona's XtraDB&lt;/a&gt; to allow rapid and multiple updates to the same row. However, our next bottleneck was our OpenX single-master database deployment, so we moved to &lt;a href="http://www.openx.org/en/docs/2.6/adminguide/Distributed+statistics"&gt;OpenX Distributed Statistics&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Over time, we enabled a very detailed and elegant monitoring and alerting system using &lt;a href="http://www.nagios.org/"&gt;Nagios&lt;/a&gt; and &lt;a href="http://munin-monitoring.org/"&gt;Munin&lt;/a&gt;, which allows us to be alerted in seconds when failures do happen. 
We also got better at simply assuming that failures will happen, so that the final system designs were more fault-tolerant and more redundant. For example, we set up a duplicate of the full ad rendering system in a separate location (using an AWS Availability Zone).&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Phase 3: CPM Pacing Problem &lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Once a substantial number of campaigns with varying parameters were loaded into the OpenX Ad Server, we started noticing that sometimes certain campaigns were over- or under-paced. Worse, sometimes blank ads were served. The latter caused embarrassment all around.&lt;br /&gt;&lt;br /&gt;So, we upgraded to OpenX 2.8.3 hoping that it would improve the pacing, but it did not. When we talked to the OpenX team, they recommended we abandon the OpenX Community Download version and move to OpenX Enterprise (which they host). We tested pacing on their enterprise version and, except for a startup hiccup in the first hour or two, it did remarkably better than their open-source version. However, we could not rapidly move a running network to OpenX Enterprise because a) we found compatibility problems between our OpenX version and the hosted version (over the course of the year, whenever we had found bugs/limitations in the OpenX API, we had made direct calls to the OpenX database), and b) getting data out of hosted OpenX was non-trivial work. By the way, we think very highly of the OpenX team and wish them all the success!&lt;br /&gt;&lt;br /&gt;So, we went ahead with our own pacing solution (Project Olympus), which we grafted into OpenX 2.8.3 (Community version) in the banner selection code path. This way, when a creative request came in, we looked up in memory-based tables which campaign/creative made the best match and returned it. The rest of OpenX was used as is (e.g. logging the request). Using this system, we were able to manage the campaigns at an even pace while maintaining the daily and overall target requirements put in by the AdOps team. 
Yes, we also handled frequency capping and geographic restrictions. Without spilling the beans, this was a best-of-breed design and it worked at scale!&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Phase 4: Business Requirements Change Again&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Due to market pressure, we had to rapidly become a really large marketplace for publishers and advertisers. In order to do that, we needed the latest and best Ad Server features (e.g. professional-grade AdOps management and reports, Real Time Bidding, integration with DSPs, etc.) and marketplace features (e.g. finding advertisers and publishers outside our network).&lt;br /&gt;&lt;br /&gt;In this phase, we are moving to a fully functioning Ad Server/Marketplace which has all these features and more, so that we can focus on bringing transparency and control to our publishers. More on this in later posts.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3915486011379503126-7327526850539516215?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/7327526850539516215/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=7327526850539516215' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/7327526850539516215'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/7327526850539516215'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2010/05/building-out-scaleable-ad-platform.html' title='Building out a Scaleable Ad Platform'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-2438425208055325218</id><published>2010-04-08T16:52:00.001-07:00</published><updated>2010-04-08T16:59:12.378-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Ad Server'/><title type='text'>Building Scaleable Ad Server</title><content type='html'>We started this journey last year, building on top of an open-source Ad Server platform (&lt;a href="http://www.openx.org/"&gt;http://www.openx.org&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;Here is the short version:&lt;br /&gt; - We were able to handle traffic north of 20m/day, with peaks well north of that load&lt;br /&gt; - Yes, we dealt with all kinds of BIG problems you encounter when building scaleable systems (replication failures, row-level contention, too many connections to the database, exceeding memory...) and small problems we caused ourselves (binlog filling up, DNS set up wrong, literally halting a server by accident...)&lt;br /&gt; - In the end, we built a highly scaleable solution with pacing. 
Then the requirements changed!&lt;br /&gt;&lt;br /&gt;More in subsequent posts.&lt;br /&gt;&lt;br /&gt;Here is a blog post on a similar effort: &lt;a href="http://www.mikeonads.com/2010/04/04/the-challenge-of-scaling-an-adserver/"&gt;http://www.mikeonads.com/2010/04/04/the-challenge-of-scaling-an-adserver/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3915486011379503126-2438425208055325218?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/2438425208055325218/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=2438425208055325218' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/2438425208055325218'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/2438425208055325218'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2010/04/building-scaleable-ad-server.html' title='Building Scaleable Ad Server'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-1314658319810068444</id><published>2010-03-11T13:13:00.000-08:00</published><updated>2010-03-11T13:14:47.289-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Highly Scaleable Systems'/><title type='text'>Using Cassandra for solving Big Data problems</title><content type='html'>Interesting presentation from a fellow 
Seattle startup.&lt;br /&gt;&lt;br /&gt;http://www.startupmonkeys.com/2010/03/cassandra-frugal-mechanic/&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3915486011379503126-1314658319810068444?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/1314658319810068444/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=1314658319810068444' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/1314658319810068444'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/1314658319810068444'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2010/03/using-cassandra-for-solving-big-data.html' title='Using Cassandra for solving Big Data problems'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-7486144280275150621</id><published>2009-12-09T10:56:00.000-08:00</published><updated>2009-12-09T10:58:32.810-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Highly Scaleable Systems'/><title type='text'>Building a Highly Scaleable Ad Platform</title><content type='html'>Over the next couple of weeks, I will write about the technical problems we are solving at 5to1.com as we build a scaleable ad platform (read: billions of ads a day).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/3915486011379503126-7486144280275150621?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/7486144280275150621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=7486144280275150621' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/7486144280275150621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/7486144280275150621'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2009/12/building-highly-scaleable-ad-platform.html' title='Building a Highly Scaleable Ad Platform'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3915486011379503126.post-1557243400157183703</id><published>2007-10-02T17:45:00.000-07:00</published><updated>2007-10-02T17:51:49.934-07:00</updated><title type='text'>Collaboration in a Dev Shop</title><content type='html'>These two good postings (&lt;a href="http://www.joelonsoftware.com/articles/FieldGuidetoDevelopers.html"&gt;http://www.joelonsoftware.com/articles/FieldGuidetoDevelopers.html&lt;/a&gt; and &lt;a title="http://recycledknowledge.blogspot.com/2005/06/flow-stuckness-and-interruptions.html" href="http://recycledknowledge.blogspot.com/2005/06/flow-stuckness-and-interruptions.html"&gt;http://recycledknowledge.blogspot.com/2005/06/flow-stuckness-and-interruptions.html&lt;/a&gt;) lead us to muse what works better 
for us at SecondSpace.com.&lt;br /&gt;&lt;br /&gt;Giving offices to developers is not the right answer. I saw for years at Microsoft how it led to a lack of collaborative work. But the problem with sitting in a cube is that you get interrupted all the time and may not be productive either. So we need some way to bridge the two opposing camps. Jake @ secondspace.com had this to add:&lt;br /&gt;&lt;p&gt;&lt;em&gt;At All Star Directories, we had a lot of flow etiquette in place to make sure we weren’t interrupting unless it was urgent&lt;/em&gt;&lt;em&gt;&lt;br /&gt;&lt;/em&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;em&gt;Schedule collaboration time, even if it’s 15 minutes.  Easier to plan your day that way. &lt;/em&gt;&lt;/li&gt;&lt;li&gt;&lt;em&gt;Try not to interrupt someone because you didn’t get your work done and you now are&lt;/em&gt;&lt;em&gt; in fire-extinguisher-mode. &lt;/em&gt;&lt;/li&gt;&lt;li&gt;&lt;em&gt;If you need to talk to someone in person, ping them on Messenger first and ask if it’s okay. &lt;/em&gt;&lt;/li&gt;&lt;li&gt;&lt;em&gt;Emails are better information repositories than conversations. &lt;/em&gt;&lt;/li&gt;&lt;li&gt;&lt;em&gt;Have a signal that you really need to work and can’t afford unnecessary interruptions.  We would put up a post-it requesting people email or IM us. 
&lt;/em&gt;&lt;br /&gt; &lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3915486011379503126-1557243400157183703?l=fineobservations.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://fineobservations.blogspot.com/feeds/1557243400157183703/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3915486011379503126&amp;postID=1557243400157183703' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/1557243400157183703'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3915486011379503126/posts/default/1557243400157183703'/><link rel='alternate' type='text/html' href='http://fineobservations.blogspot.com/2007/10/collaboration-in-dev-shop.html' title='Collaboration in a Dev Shop'/><author><name>Alok Sinha</name><uri>http://www.blogger.com/profile/02755631272268859913</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
