Wednesday, October 12, 2011

Loading Lists into SQL Server

We recently ran into a performance wall when loading objects of varying sizes (from 1 to, say, a million properties) into SQL Server 2008.

As you will find, an ADO.NET DataTable (passed as a Table-Valued Parameter, or TVP, in SQL land) is the fastest way to load lists and arrays into SQL Server.

Table-Valued Parameters in SQL Server 2008 and Handling Arrays and Lists in SQL

OK, this was not fast enough for our needs. We were comparing our object write performance against a correspondingly long string, and we kept coming up short. For example, if you load an array of strings (say, each 1 KB long) via a DataTable, and you do the same with one simple SProc which takes one parameter (a single 10 KB string), doing one insert is (predictably) the fastest way.

But we would never know, a priori, the size of the array/list, so we could not create a static SProc that takes the correct number of parameters.

So, we tried breaking up large arrays into, say, 5 parameters each and calling a single SProc which took 5 parameters. Of course, we had N threads doing this (each thread calling the SProc). Intuitively, you would think that this would yield better throughput. But it did not! The DataTable (aka TVP) still beat the final throughput results, while still not meeting our write throughput expectations.
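The batching itself is straightforward. Here is a minimal sketch of the splitting step (in Python for brevity; our real code was .NET, and the SProc name and padding value are hypothetical):

```python
def chunk(items, size=5, pad=None):
    """Yield lists of exactly `size` items, padding the last batch so every
    call to the fixed-arity SProc gets the same number of parameters."""
    for i in range(0, len(items), size):
        batch = items[i:i + size]
        if len(batch) < size:
            batch = batch + [pad] * (size - len(batch))
        yield batch

# Each batch would then be handed to a worker thread that calls something like:
#   EXEC dbo.InsertFive @p1=?, @p2=?, @p3=?, @p4=?, @p5=?
batches = list(chunk(["a", "b", "c", "d", "e", "f", "g"]))
```

Each worker thread takes one batch and makes one SProc call, so a million-element array becomes 200,000 calls spread over N threads.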

So, we profiled the actual SProc being called. Before we mention what we found, to preempt the SQL experts' questions: we had stripped almost everything out of the SProc except the insert of the N parameters (columns) into 3 tables.

We discovered that ADO.NET basically creates dynamic SQL like "INSERT INTO @p1 VALUES ..." before calling our SProc. So even the fastest way of loading lists and arrays into SQL involves some dynamic SQL which needs to be compiled each time!

We asked the SQL Server team and they confirmed what we found. They did, however, suggest using SqlDataRecord, and we found it yielded a 5-10% improvement (but we were looking for a 100% improvement). So, we will wait for the next release of SQL Server, which promises better throughput.









Friday, September 9, 2011

Merging Facial Recognition with Data About the User

One has to, of course, worry about loss of privacy and valid legal challenges.

However, on a purely technical and business level, this is quite a convergence of targeting: being able to recognize a user (visually, in this case) and attach that to data about the user (why stop at the ads they clicked on... go on to their consumer profile and find out what type of car they can afford...).

Alessandro Acquisti, Ph.D., a researcher and instructor at Carnegie Mellon, has designed an iPhone app that functions as a front end for PittPatt's facial recognition technology. As mentioned, it can identify strangers' Facebook profiles with startling accuracy.

And that's not all it can do. It also incorporates searches of public databases that allow it to make a good guess at your social security number. If it knows your date of birth (e.g. if your Facebook profile is public), there's a good chance it can ID your social security number.

The real question is how to make this technology available to users in a way that makes their lives a bit easier without sacrificing privacy. (Remember, a decade ago you did not have a cell phone, so your wife or your boss could not reach you when you traveled. Cell phones have made our lives easier, but we are now available all the time...)

Friday, August 12, 2011

Big Data - Visualizing the Exploding Data Growth

I found this blog post really illuminating. The actual data volume/growth curve may be off, but at least it captures in one place the sheer magnitude of the data.

http://blog.getsatisfaction.com/2011/07/13/big-data/?view=socialstudies

Wednesday, August 18, 2010

Web Browsing Is Coming to an End

Apps are the way to go - from apps on iPhone, iPad, Facebook, NetTV... to the Windows Desktop (I know, old is new again).

Great read: http://www.wired.com/magazine/2010/08/ff_webrip/

Thursday, May 27, 2010

Good primer on RTB

We are moving to an RTB platform for serving ads. More details will come in the near future.





I found the following posts to be a really good read as an RTB 101:

Part 1: http://www.mikeonads.com/2009/08/30/rtb-part-i-what-is-it/

Part 2: http://www.mikeonads.com/2009/09/19/rtb-part-ii-supply-supply-supply/

Part 3: http://www.mikeonads.com/2010/02/22/rtb-part-iii-cookies-user-data/




Friday, May 21, 2010

Digital Ad Market


The following slideware/talk is a must-read for understanding the current ecosystem and its players.



Thursday, May 6, 2010

Building Out a Scalable Ad Platform


In this post, I will provide the details of our almost one-year journey building out a scalable Ad Network that can handle 10 million, 100 million, and, yes, in the near future, 1 billion ads a day.

Phase 1: Building a CPA Network

Almost a year ago, we were mostly focused on a tool - the Publisher Gallery - which publishers could use to find relevant creatives/campaigns and drag/drop them into their playlist (usually matching a placement on their web site). At this time, as a business, we were mostly focused on being a performance-based (CPA) Ad Network. OpenX (Community Download) fit the bill perfectly as it was open-source, had a rich set of APIs, and, most importantly, we found a way (XML-RPC) to talk to its API. We had to modify a few things in it, but it worked as advertised. We built a simple AdOps tool to allow uploading IOs/Creatives into the OpenX system, while the functionality-rich Publisher Gallery allowed specific creatives to be added to specific placements (the Gallery indirectly called the OpenX API).

Once we had a solution for serving ads (leveraging OpenX), we started building the plumbing to make the network scalable. The first step was to use Amazon Web Services (AWS) to spin up as many OpenX web servers as necessary. The next step was to create replicated MySQL servers so there was no single point of failure. This tested well for up to 10 million ads a day.

One hard problem we had to solve at this stage was creating/managing AWS EC2 instances. After finding the Amazon Management Console too primitive, we quickly moved to ElasticFox as a better way to manage the instances. (BTW: long term, we had to build our own version, taking away some functionality to avoid mistakes, e.g. deleting an instance.)

However, updating EC2 instances with the latest software packages was a huge chore and used to take us days. So, after a bit of research, we opted to use Chef to manage the installation and configuration of many of the EC2 instances. Our thanks to the Chef team for helping us in the beginning. We still use this tool and can instantiate a couple dozen EC2 instances in less than 30 minutes. But using the tool to manage your ever-evolving software releases requires a dedicated engineer, so it is an expensive way to manage AWS EC2 instances.

The next challenge was to gather all the impression/revenue data from OpenX and make it available in the Publisher Gallery, and also to surface it to AdOps/FinOps. For this purpose, we built a map/reduce process to periodically pull data from OpenX and put summary tables into our databases. (While the core code is not that complicated, we found it operationally challenging due to occasional network failures, the need to rerun the process to restore data, and failures in the databases themselves.) The summary data was then made available in the Publisher Gallery. For Ad Operations, we exposed the data, and many custom reports, via Jasper.
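The reduce step is essentially a grouped sum. A minimal sketch of the idea (field names like `placement_id` are hypothetical, not our actual schema):

```python
from collections import defaultdict

def summarize(impressions):
    """Map phase: key each raw impression by (placement, day).
    Reduce phase: sum impression counts and revenue per key."""
    totals = defaultdict(lambda: {"impressions": 0, "revenue": 0.0})
    for imp in impressions:
        key = (imp["placement_id"], imp["day"])
        totals[key]["impressions"] += 1
        totals[key]["revenue"] += imp["revenue"]
    return dict(totals)
```

The resulting per-key rows are what get written into the summary tables that the Gallery and Jasper reports read.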

The final piece of the puzzle was adding more fault tolerance. So, we added a couple of HAProxy load balancers in front of the OpenX web servers. We also added a heartbeat monitor to detect a non-functioning load balancer and make the working load balancer the active LB. Next, we added a second MySQL instance as a replica of the OpenX database and set up continuous replication (yes, we were still on OpenX 2.6, which had a single master database). We experimented with some solutions (MySQL Proxy, HAProxy, AWS EIP), but no one solution provided seamless and automated failover. So, we decided to stay with manual failover to the replica database should the primary database fail (see later about OpenX Distributed Statistics).
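The heartbeat logic itself boils down to "promote the first healthy LB in priority order." A sketch of that decision (the health probe is a stand-in; a real one would ping the HAProxy stats endpoint or use keepalived):

```python
def pick_active(load_balancers, is_healthy):
    """Probe each load balancer in priority order and promote the first
    healthy one to active. `is_healthy` wraps the actual heartbeat check."""
    for lb in load_balancers:
        if is_healthy(lb):
            return lb
    raise RuntimeError("no healthy load balancer available")
```

Run on a short timer, this keeps traffic flowing to the standby LB within one probe interval of the primary failing.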

Phase 2: Change over to CPM Network

Right about when we were about to release the feature-rich Gallery (v2) and had the CPA Ad Network functional, our core business shifted from CPA-focused to CPM-focused. While, functionally, our entire end-to-end system worked with minimal changes, all hell broke loose once the load started arriving!

Our first weakness showed in how we had tested OpenX. Real-world (ad-request) load patterns were different from what we had assumed during testing. So, we quickly added more OpenX web servers to handle the incoming load.

Our next set of challenges was caused by our lack of experience with MySQL administration and maintenance. First, we ran out of disk volume space, which we solved by putting the MySQL tables (and later the binlog) on 100 GB (and later terabyte) size Elastic Block Store (EBS) volumes. We also had to create an archive-and-delete process to reduce the raw impression table size so that read and write speeds stayed reasonable. Finally, we enabled a full backup and restore system, so despite failures, we can now always restore data.
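The archive-and-delete process works in bounded batches so it never locks the raw table for long. A sketch of one batch's SQL (table and column names are made up for illustration, not our schema):

```python
def archive_batch_sql(cutoff, batch_size=10000):
    """Build one archive/purge batch: copy rows older than `cutoff` into
    the archive table, then delete the same number of old rows. Bounded
    LIMITs keep each batch's lock time short; repeat until no rows match."""
    copy = ("INSERT INTO impressions_archive "
            "SELECT * FROM impressions WHERE logged_at < '%s' LIMIT %d"
            % (cutoff, batch_size))
    purge = ("DELETE FROM impressions WHERE logged_at < '%s' LIMIT %d"
             % (cutoff, batch_size))
    return copy, purge
```

A driver loop executes the pair until the DELETE affects zero rows, then moves the cutoff forward on the next run.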

We also learned that for rapid transactional writes of impression data, the default MyISAM storage engine was inadequate (table locking). So, we migrated to the InnoDB storage engine (row-level locking) and then hit row-level contention at scale. This prompted us to use Percona's XtraDB to allow rapid, repeated updates to the same row. However, our next bottleneck was our single-master OpenX database deployment, and we moved to OpenX Distributed Statistics.
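One standard way to relieve that kind of hot-row contention is to aggregate increments in memory and flush them periodically, so each hot counter row takes one UPDATE per flush instead of one per impression. A sketch (table/column names hypothetical; this is the general technique, not our exact fix):

```python
from collections import Counter

def batch_increments(banner_ids):
    """Aggregate per-banner impression counts in memory between flushes."""
    return Counter(banner_ids)

def flush_sql(counts, table="banner_stats"):
    """Emit one UPDATE per hot row instead of one per impression."""
    return ["UPDATE %s SET impressions = impressions + %d WHERE banner_id = %d"
            % (table, n, banner_id)
            for banner_id, n in sorted(counts.items())]
```

The trade-off is that a crash between flushes loses the in-memory counts, which is usually acceptable for stats but not for billing-grade data.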

Over time, we enabled a very detailed and elegant monitoring and alerting system using Nagios and Munin, which allows us to be alerted within seconds when failures do happen. We also got better at simply assuming that failures will happen, so that the final system designs were more fault-tolerant and more redundant. For example, we set up a duplicate of the full ad-rendering system in a separate location (using an AWS Availability Zone).

Phase 3: CPM Pacing Problem

Once a substantial number of campaigns with varying parameters were loaded into the OpenX Ad Server, we started noticing that certain campaigns were sometimes over- or under-paced. Worse, sometimes blank ads were served. The latter caused embarrassment all around.

So, we upgraded to OpenX 2.8.3 hoping that it would improve the pacing, but it did not. When we talked to the OpenX team, they recommended we abandon the OpenX Community Download version and move to OpenX Enterprise (which they host). We tested pacing on their Enterprise version and, except for a startup hiccup in the first hour or two, it did remarkably better than their open-source version. However, we could not rapidly move a running network to OpenX Enterprise because a) we found compatibility problems between our OpenX version and the hosted version (over the past year, when we had found bugs/limitations in the OpenX API, we had made direct calls to the OpenX database), and b) getting data out of hosted OpenX was non-trivial work. By the way, we think very highly of the OpenX team and wish them all the success!

So, we went ahead with our own pacing solution (Project Olympus), which we grafted into OpenX 2.8.3 (Community version) in the banner selection code path. This way, when a creative request came in, we would look up in memory-based tables which campaign/creative made the best match and return it. The rest of OpenX was used as is (e.g. logging the request). Using this system, we were able to manage the campaigns at an even pace while maintaining the daily and overall target requirements put in by the AdOps team. Yes, we also handled frequency capping and geographic restrictions. Without spilling the beans, this was a best-of-breed design and it worked at scale!
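Without spilling those beans, the textbook core of even pacing can be sketched: at each serving interval, spread a campaign's remaining goal uniformly over the intervals left in the day, and stop serving once the current interval hits its share. (Function names and numbers are illustrative, not Project Olympus internals.)

```python
def pace_target(total_goal, delivered, intervals_left):
    """Even pacing: spread the remaining daily goal uniformly over the
    remaining serving intervals."""
    remaining = max(total_goal - delivered, 0)
    if intervals_left <= 0:
        return remaining  # end of day: serve whatever is left
    return remaining / intervals_left

def should_serve(total_goal, delivered_at_interval_start,
                 interval_delivered, intervals_left):
    """Serve only while this interval is under its share of the goal."""
    target = pace_target(total_goal, delivered_at_interval_start, intervals_left)
    return interval_delivered < target
```

Because the target is recomputed every interval from actual delivery, a campaign that fell behind (or ran hot) self-corrects over the rest of the day.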

Phase 4: Business Requirements Change Again

Due to market pressure, we had to rapidly become a really large marketplace for publishers and advertisers. In order to do that, we needed the latest and best Ad Server features (e.g. professional-grade AdOps management and reports, Real-Time Bidding, integration with DSPs, etc.) and marketplace features (e.g. finding advertisers and publishers outside our network).

In this phase, we are moving to a fully functioning Ad Server/Marketplace which has all these features and more, so we can focus on bringing transparency and control to our publishers. More on this in later posts.