Monday, November 02, 2009

Firefox 3.6 has an interesting feature set

Firefox 3.6 Beta 1 has a really interesting set of new features. Read this page for the full feature list; I have reproduced an excerpt that I found interesting:

DOM
Drag and drop now supports files
The DataTransfer object provided to drag listeners now includes a list of files that were dragged.

Detecting device orientation
Content can now detect the orientation of the device if it has a supported accelerometer, using the MozOrientation event; see window.onmozorientation for details. Firefox 3.6 supports the accelerometer in Mac laptops.
For XUL and add-on developers

If you're an extension developer, you should start by reading Updating extensions for Firefox 3.6, which offers a helpful overview of what changes may affect your extension. Plug-in developers should read Updating plug-ins for Firefox 3.6.

New features

Monitoring HTTP activity
You can now monitor HTTP transactions to observe requests and responses in real time.


Sunday, September 06, 2009

Web Scalability & Performance: Real Life Lessons

Following is a presentation that I made at TechWeekend in Pune on 5th September. About sixty hard-core technical geeks were present at the sessions. Feel free to share.
Web Scalability & Performance
You can reach me on Twitter @mukulneetika.

Saturday, April 25, 2009

8 cores and above - Is the race worth it?

AMD has announced plans to beat Intel to 12 cores, releasing both 8- and 12-core processors, codenamed Magny-Cours, in Q1 2010. It has also announced that in 2011 it will roll out its 32 nm Bulldozer core, which will feature up to 16 cores, running on the new Sandtiger architecture. In short -- AMD plans to beat Intel in the core race.

Note, however, that engineers at Sandia National Laboratories simulated 8, 16 and 32 cores, and concluded that the performance of multi-core machines would level off or even decline beyond 8 cores, due to limited memory bandwidth.


Read more here.

At the heart of the trouble is the so-called memory wall—the growing disparity between how fast a CPU can operate on data and how fast it can get the data it needs. Although the number of cores per processor is increasing, the number of connections from the chip to the rest of the computer is not. So keeping all the cores fed with data is a problem. In informatics applications, the problem is worse, explains Richard C. Murphy, a senior member of the technical staff at Sandia, because there is no physical relationship between what a processor may be working on and where the next set of data it needs may reside. Instead of being in the cache of the core next door, the data may be on a DRAM chip in a rack 20 meters away and need to leave the chip, pass through one or more routers and optical fibers, and find its way onto the processor.

I would really love to run the MT tests that would show the performance at 6 and 8 cores. Searching ...

Sunday, April 12, 2009

Go for that “Impulse Purchase”

‘Planned purchase’ is something I have been doing for the past several years. It is generally believed that as you “mature” you do more planned purchases versus impulsive purchases. Impulsive purchase has been linked with the “impulsive youth”. Sometimes it has also been associated with immaturity. While planned purchase is something I have generally been very proud of, I have lately realized the downside of this approach, and I am beginning to wonder if it is a bad thing.

Planned purchase is invariably associated with – a) massive market research, b) massive product research, c) understanding your own requirements, d) deciding the exact price point for the best price-performance ratio, e) sometimes getting into the microscopic details of the product, for differentiation in a commoditized product, technology or service. All this might take anywhere from a few days to a few months – eventually leading to a “better purchase”.

When you buy something, one of the important things that must happen after the purchase is – “you must derive a great sense of satisfaction from your purchase”. I have come up with an approximate relationship between satisfaction and the research you do on the product. The relationship is:

Satisfaction ∝ 1 / (amount of research)

So, I am postulating that satisfaction is inversely proportional to the amount of research you do on the product. The more research you do, the less satisfaction you get after the purchase.

If you are buying an electronic gadget, there are often several small, less-documented features that come with it – like a quick-access button on a smartphone, a hidden pocket in a backpack or the cool 3.2MP webcam in a laptop. These features feel almost like “serendipity”: when you accidentally find one, you are pleasantly surprised by your new gadget. All that surprise is gone when you have done 4 months of study on all the smartphones available in the market. I feel that this is a big bummer.

An extension of the above rule is that satisfaction is directly proportional to the ‘Impulsiveness of Purchase’. I can think of ample examples of friends who have purchased stuff with little research, and have been immensely satisfied with their purchase.



What do you think?

A caveat to these rules: don’t apply them to large purchases, like buying a house or a car.

So, guess what, as of today, I am shifting my slider from the ‘planned purchase’ to the ‘impulsive purchase’ side.

Tuesday, April 07, 2009

Analysis: Which URL Shortening Service Should You Use?

Danny Sullivan has written a great writeup comparing different URL shorteners. The article is really interesting; you should read it here.

Danny presented this table, which is the key part of the article. Check it out.

I also did a quick search on the popularity of each of the URL shortening services. Here are the results:

tinyurl.com 23,000,000
bit.ly 3,840,000
is.gd 3,160,000
tr.im 480,000
cli.gs 144,000
snurl 109,000
kl.am 8,810
short.ie 6,060

Thursday, March 05, 2009

Mikz - How different is it from httpd4mobile and mymobilesite?

TechCrunch reports this story today:
Conveneer, a Swedish mobile startup with offices in Lund, Sweden and Palo Alto, California, closed a $4.5 million venture round, led by the Swedish foundation Industrifonden. Broken Arrow Venture Capital also participated. The company previously raised seed money from the founders and Teknoseed. Conveneer is building a mobile platform called Mikz, which will be able to assign a URL to your mobile phone, making the content on your phone accessible on the Web. In essence, it turns each mobile phone  into a Web server. Once your phone has a URL like http://joe.mikz.me, other Web applications and services can ingest the data that is locked in your phone, and also your phone can take advantage of common Web APIs. Mikz can pull information off your phone such as your contacts, GPS coordinates, photos, music, ringtones, and other files. It creates a Web interface for your phone.
I haven't gone too deep into how they do it, but I see that httpd4mobile also does something similar. Httpd4mobile is an HTTP mobile server for Java J2ME mobile devices that enables you to access and control various features such as Camera Picture function, Audio Record function, Contact List, SMS Send, File Download, File Upload, etc.

See also mymobilesite, which works on Nokia S60 devices and allows users to create, share and access contacts, calendar appointments, SMS text messages, emails, phone logs, pictures, etc.

What do you think?

Wednesday, March 04, 2009

Microsoft Planning Ad-Supported Model For Office 14?

Money.cnn.com reported:

An ad-supported Microsoft Office 14?

That's what Microsoft Business Division Chief Stephen Elop said was coming at a presentation to analysts at the Morgan Stanley Technology conference today.

Here are my questions :-)

Is this the online version of Office? Would Microsoft do contextual analysis of the documents to put the ads on Office 14? If yes, then the content is no longer private; Microsoft would know all the content (maybe a software pirate would care less). If not, how would they place ads, and how would they target a user?

Looking at the top 20 countries with the highest piracy rates, I am not sure there can be relevant ads for those geographies. See the list yourself here.

Interesting.

Contribute to a Map-Reduce job by simply pointing your browser to a URL

Igvita.com has published an interesting way of running Map-Reduce jobs:
After several iterations, false starts, and great conversations with Michael Nielsen, a flash of the obvious came: HTTP + Javascript!

What if you could contribute to a computational (Map-Reduce) job by simply pointing your browser to a URL? Surely your social network wouldn't mind opening a background tab to help you crunch a dataset or two!

Instead of focusing on high-throughput proprietary protocols and high-efficiency data planes to distribute and deliver the data, we could use battle tested solutions: HTTP and your favorite browser. It just so happens that there are more Javascript processors around the world (every browser can run it) than for any other language out there - a perfect data processing platform.
Some really interesting comments in there too. Read full text here.

Saturday, February 28, 2009

10 Interesting Articles for the Weekend Read

I found the following articles pretty interesting, check them out:

Startups in 13 Sentences - by Paul Graham

How FriendFeed uses MySQL to store schema-less data

Giz Explains: Why Lenses Are the Real Key to Stunning Photos

Cellular providers want Nokia to drop Skype from cell phones - Two cell service providers in the UK are supposedly up in arms over Nokia's inclusion of Skype software on its N97 handset, and are threatening not to carry the device unless the software is ditched. This stance is not only annoying to consumers who are beginning to like VoIP, but it could also even hurt the carriers' business in the long run.

The Big List Of Search Engines & Their Employees On Twitter

How to Buy Domain Names Like a Pro: 10 Tips from the Founder of PhoneTag.com

10 Free CAPTCHA scripts and services for websites

OCRTerminal - free online Optical Character Recognition service that allows you to convert scanned images and PDFs into editable and text-searchable documents.

10 Papers Every Programmer Should Read (At Least Twice)

15 online photo editors compared - compares Flauntr, Fotoflexer, Lunapic, Phixr, Phoenix, Photoshop.com, Picnik free, Picnik premium, Picture2Life, Pixenate, Pixer.us, Pixlr, Snipshot, Snipshot Pro and Splashup.

Y Combinator’s FathomDB looks pretty neat

FathomDB provides relational databases under the utility/service model. They say that they automate the low-level DBA tasks (backup/monitoring) and also provide performance analysis tools that facilitate the high-level DBA tasks.

Final pricing for the service is still being determined, but the company plans to charge a small (~10-20%) markup over standard EC2 prices.

I think this is pretty neat. It's a great trend. I hope they don't have any 'central point of failure', and I hope their monitoring resources and backup locations are widely distributed. I think DB backup brings in a lot of value.

This screenshot at TechCrunch looks pretty interesting:


Friday, February 27, 2009

Carol Bartz on 'getting the house in order'

Carol Bartz, CEO of Yahoo!, writes on her blog:
Today I’m rolling out a new management structure that I believe will make Yahoo! a lot faster on its feet. For us working at Yahoo!, it means everything gets simpler. We’ll be able to make speedier decisions, the notorious silos are gone, and we have a renewed focus on the customer. For you using Yahoo! every day, it will better enable us to deliver products that make you say, “Wow.”

Impressive.

The reorg at Yahoo! has been discussed in the following articles:

Thursday, February 26, 2009

Microsoft to use Machine Learning software to put servers to sleep when not in use

SAI reports:

Microsoft is working on new tech codenamed "Marlowe" to build data centers using low-power servers that can intelligently "sleep" and "wake up" -- just like a portable computer.

NY Times: The company has applied sophisticated machine learning software to the Atom-based servers and tracked how they handle search requests on Microsoft Live over the course of a day.

When the software senses a lull in action, it can place large numbers of servers into sleep or hibernate modes so that they consume just 2 to 4 watts instead of the usual 28 to 37 watts. Then, in an ideal set-up, the software can anticipate when more active periods will resume and begin waking up the servers ahead of the incoming search requests. It usually takes the servers about 5 to 45 seconds to jump back into action.

I think this is a great idea, if you can predict the load on the servers, and if your datacenter is not already optimized. With hundreds of cron jobs running at asynchronous intervals, DBs and file systems deciding on their own schedule of page flushes, and server load coming from around the globe, it might be a challenge to accurately predict the load. But that's what software is for.
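As a rough illustration of what the quoted wattage figures could mean, here is a small Python sketch. The server count and the hours of sleep are hypothetical inputs that I made up; only the watt ranges (2 to 4 W asleep, 28 to 37 W awake) come from the quote above. It also ignores the energy spent waking servers up, so treat it as a back-of-the-envelope estimate.

```python
# Back-of-the-envelope estimate of daily energy use for a server fleet
# that sleeps during quiet hours. Inputs below are hypothetical.

def daily_energy_kwh(servers: int, hours_asleep: float,
                     active_watts: float, sleep_watts: float) -> float:
    """Return the fleet's energy use over 24 hours, in kWh."""
    if not 0 <= hours_asleep <= 24:
        raise ValueError("hours_asleep must be between 0 and 24")
    hours_active = 24 - hours_asleep
    watt_hours = servers * (hours_active * active_watts
                            + hours_asleep * sleep_watts)
    return watt_hours / 1000.0


if __name__ == "__main__":
    # Conservative ends of the quoted ranges: 28 W awake, 4 W asleep.
    always_on = daily_energy_kwh(1000, 0, 28, 4)
    with_sleep = daily_energy_kwh(1000, 8, 28, 4)
    saved = always_on - with_sleep
    print(f"always on: {always_on} kWh, with sleep: {with_sleep} kWh")
    print(f"saved: {saved} kWh ({100 * saved / always_on:.1f}%)")
```

The savings scale directly with how many hours you can safely keep machines asleep, which is why the load prediction matters so much.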

Watch this space.

'AWS Public Data sets' has full Wikipedia available in TSV format

'Amazon Web Services Blog' reports that the AWS public data sets include the Wikipedia Extraction (WEX), which is a processed, machine-readable dump of the English-language section of Wikipedia. At nearly 67 GB, this is a handy and formidable data set. The data is provided in the TSV format as exported by PostgreSQL.

There are a number of other data sets also available, read more here.

They also describe how easily you can use these data sets:
Instantiating these data sets is basically trivial. You create a new EBS volume of the appropriate size, basing it on the snapshot id of the data. Next, you attach the volume to a running EC2 instance in the same availability zone. Finally, you create a mount point and mount the EBS volume on the instance.

Awesome.

Google Apps Status Dashboard looks pretty good

Google has finally come up with a status dashboard. I had previously reported on other Web Services that do similar status reporting, such as http://status.aws.amazon.com/ , http://status.mosso.com/ and http://heartbeat.skype.com/.

Looks great - as long as all the check boxes stay ticked.



Nokia to enter laptop market?

jkOnTheRun reports this story -
Reuters is reporting that Finnish handset giant Nokia has admitted they are considering entering the laptop market.  In an interview in Finland CEO Pekka Kallasvuo was asked if Nokia plans to make laptops and had this response:

“We are looking very actively also at this opportunity…  We don’t have to look even for five years from now to see that what we know as a mobile phone and what we know as a PC are in many ways converging.”


Nokia also reported earlier that it would shrink production and R&D as sales tank and its share of the smartphone market rapidly shrinks. Netbooks are a growing market; no wonder Nokia wants a piece of the pie.

Wednesday, February 25, 2009

CDNetworks Acquires Panther Express To Speed Expansion In The U.S.

Dan Rayburn does a detailed writeup on this story here:
This morning, CDNetworks announced that it has acquired Panther Express. Headquartered in NYC, privately held Panther Express has been in the content delivery business since 2005 offering HTTP based delivery services in the U.S and Europe.

Panther's footprint gives CDNetworks quicker access into North America and Europe and allows them to ramp sales much faster. What Panther Express lacks is the reach into Asia, the ability to support streaming media services, including live delivery and access to a large sales and marketing force. CDNetworks has the footprint in Asia, supports streaming of Flash, Silverlight and Windows Media live or on-demand and is an organization of over 400 employees after the inclusion of Panther's team.


PubMatic Launches Open, Real-time Monetization API for Ad Networks and Publishers

PubMatic's API Allows Ad Networks to Instantly Access Premium Publisher Inventory Resulting in Increased Revenue for Participating Publishers

PubMatic (www.pubmatic.com), an ad revenue optimization company that works with over 5,500 online publishers, announced today the official launch of its highly anticipated Application Programming Interface (API). The API will increase revenue for both ad networks and PubMatic's publisher clients by allowing an instant and transparent connection between them for single or multiple ad campaigns on demand. The official API launch comes after a successful closed beta period with multiple ad network partners.

Benefits for Ad Networks:
  • Increased Reach: Access to over 125 million unique users and thousands of websites.
  • Improved Targeting: Ad networks can leverage expanded targeting options including geography, frequency, user re-targeting, ad tag type, and much more.
  • Campaign Control: Ad networks have total control over pricing and timing for campaigns
Benefits for Publishers:
  • Increased Monetization: Allowing more ad networks to instantly access publisher inventory increases publisher ad rates and sell through rates.
  • Increased Visibility: Dozens of new ad networks can get instant and transparent access to publisher inventory.
  • Better User Experience: More premium campaigns and better targeting result in users seeing higher quality and more relevant advertising.
  • Zero Integration: PubMatic publishers have instant access to premium campaigns and ad networks via the API, with zero integration effort.
"Efficiency is absolutely key to improving monetization in the online advertising ecosystem," said PubMatic CEO, Rajeev Goel. "Our API was designed to seamlessly connect publishers and ad networks so they can have total control of their campaigns, which results in an easy, streamlined process that is financially beneficial to both parties. This is the future of online advertising."


Google Announces Pricing for App Engine

RWW reports:

Here is the new pricing scheme according to Google's blog post:
  • $0.10 per CPU core hour. This covers the actual CPU time an application uses to process a given request, as well as that for any Datastore usage.
  • $0.10 per GB bandwidth incoming, $0.12 per GB bandwidth outgoing.  This covers traffic directly to/from users, traffic between the app and any external servers accessed using the URLFetch API, and data sent via the Email API.
  • $0.15 per GB of data stored by the application per month.
  • $0.0001 per email recipient for emails sent by the application
In general, Google's prices seem to be slightly cheaper and less complicated than Amazon's pricing schemes for using its EC2 and S3 service. It should be noted, however, that Amazon offers a far larger feature set than App Engine. App Engine only supports the Python programming language, while EC2 gives you access to a complete, remotely hosted, on-demand operating system.
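To get a feel for what these rates add up to, here is a minimal Python sketch of a monthly bill estimate. It uses only the four prices listed above; the usage numbers in the example are hypothetical, the function and constant names are mine, and it ignores any free quota or tiering, which the excerpt does not describe.

```python
# Rough monthly cost estimate from the App Engine prices quoted above.
# All usage figures passed in are hypothetical.

CPU_PER_CORE_HOUR = 0.10
BANDWIDTH_IN_PER_GB = 0.10
BANDWIDTH_OUT_PER_GB = 0.12
STORAGE_PER_GB_MONTH = 0.15
EMAIL_PER_RECIPIENT = 0.0001


def monthly_cost(cpu_core_hours: float, gb_in: float, gb_out: float,
                 gb_stored: float, email_recipients: int) -> float:
    """Return the estimated monthly charge in US dollars."""
    return (cpu_core_hours * CPU_PER_CORE_HOUR
            + gb_in * BANDWIDTH_IN_PER_GB
            + gb_out * BANDWIDTH_OUT_PER_GB
            + gb_stored * STORAGE_PER_GB_MONTH
            + email_recipients * EMAIL_PER_RECIPIENT)


if __name__ == "__main__":
    # Hypothetical app: 100 CPU core hours, 10 GB in, 50 GB out,
    # 20 GB stored, 10,000 email recipients in a month.
    print(f"${monthly_cost(100, 10, 50, 20, 10_000):.2f}")
```

With a model this simple, it is easy to see which line item dominates for your own traffic pattern before comparing against EC2 and S3.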

Also see previous post on Comparing Clouds: Amazon EC2, Google, AppNexus, and GoGrid.

Safari 4 benchmarked: 42x faster than IE 7, 3.5x faster than Firefox 3

Holy Jesus!

CNET UK reports ...

Proving itself a staggering 42 times faster at rendering JavaScript than IE 7, our benchmarks confirm Apple's Safari 4 browser, released in beta today, is the fastest browser on the planet. In fact, it beat Google's Chrome, Firefox 3, Opera 9.6 and even Mozilla's developmental Minefield browser.

20 million Chinese sites are served by QZHTTP not Apache

The following report says that 20 million Chinese sites are served by QZHTTP, not Apache.

Also, Qzone blogging service makes the company (QQ) the largest blog site provider in the survey, surpassing the likes of Windows Live Spaces, Blogger and MySpace.

Following is the report, found on this web page, on the number of web sites found for each server product.


Netcraft Web Server Survey - http://survey.netcraft.com/Reports/200902/

Just Start Pitching: The infallible Sales Pitch

I recently made a trip to two electronics stores that also sell laptops. The first is called X-cite, the second ‘House of Laptops’. I was looking for a ThinkPad, but neither place had the Lenovo ThinkPad in stock. X-cite and ‘House of Laptops’ differ markedly in how they approach customers.

At X-cite, when I didn’t find the ThinkPad the sales guy approached me with the question – “maybe I can help you find some other laptop; if you can only tell me your specifications, features and cost preferences”. I thought about it for a few seconds, and then I left the store.

At ‘House of Laptops’, when I didn’t find the ThinkPad the sales guy immediately started pitching me a new model from Toshiba. I have never bought a Toshiba laptop and there was very little likelihood that I would buy a Toshiba. However, the sales guy totally overwhelmed me with the features of the slick Toshiba model. It had the latest Core 2 Duo, and all the gizmos that you can ask for. It was 1.8kg and priced at Rs. 49,000, with some freebies. And it had a spill-proof keyboard.

The first guy’s approach was probably more scientific, in terms of collecting customer requirements and then providing the customer with a solution. However, the customer walked off. The second guy just started pitching, without understanding the customer's requirements. Obviously the second approach is much better; there is a much better chance of selling a laptop with it. I think there is a lesson in this.

What do you think?

Tuesday, December 23, 2008

Contextually Relevant Ads from Google AdSense that made me click

Today I found an ad that I readily clicked. It was an ad for a weekend trip to Lonavala. Such a relevant ad and such great timing. I clicked on the ad, and it actually kept its word on the pricing at Rs. 3568. Good job Google AdSense!

I did a little more digging, and found 2 more interesting things:
  • Google AdSense click URL shows up in Chrome (on mouseover), that doesn't show up in any other browser - kind of cool
  • Google AdSense URL is now pointing to http://googleads.g.doubleclick.net/pagead/ , while earlier it used to point to http://pagead2.googlesyndication.com/pagead/ . Interesting. Obvious integration with DoubleClick; but is that some kind of 'lead generation' ?
Following is the contextual ad.

Friday, December 12, 2008

Some fun with Google Hindi translation!

Google translation for Hindi is pretty cool. I test drove it, then I got some ideas, here they are, have fun:

English: Google translation rocks!
Hindi: Google अनुवाद चट्टानों!

English: Google translation is cool!
Hindi: Google अनुवाद ठंडा है!

English: Google translation is a kick-ass product!
Hindi: Google अनुवाद एक लात-गधा उत्पाद है!

English: Google translation is a super awesome product!
Hindi: Google अनुवाद एक सुपर भययोग्य उत्पाद है!

Click here to try more stuff.

Wednesday, December 10, 2008

Top 15 Posts last week, which are still fresh

Following are my top 15 posts on Twitter last week. Check them out:

Komli is one of Red Herring's top 100 most promising Asian startups for 2008.
Red Herring Asia 100 recognizes the 100 “Most Promising” Asian Companies Driving the Future of Technology. Red Herring announced that 27 out of the 100 winners of the Red Herring 100 Award are from India. This is quite a high number given that China, Japan, Singapore and Malaysia were some of the other countries that were in the list as well.
Read more here.

YourBillBuddy.com is an amazing website that checks your phone bill and recommends the best mobile plan based on your usage. Check it out here.

Driveme.in lets users view streets with driving like experiences, share them, find & explore favorite places
Driveme.in is a new startup from Pune that lets users view streets with a driving-like experience and also lets them share, find and explore their favorite places online. One thing to make clear about this application is that it has nothing to do with Google Maps or its API. It is a completely independent application.
Check it out here.

Indian Media Takes To Twitter
Following the siege in Mumbai, which brought Twitter and its use by citizens to share and spread information into the limelight, media publications including Mint and DNA have signed up for Twitter. Media publications on Twitter are not new, though: the New York Times, Wired, The Economist and the Wall Street Journal have their own Twitter feeds.
Read more here.

Almost No Web Users Would Pay To Remove Ads
When we asked consumers if they would pay $39.99 a year, which comes out to less than $4 a month, for an ad-free version of one of their favorite sites, only 2.4% said definitely yes, they would be likely to do so. And only 3.5% said they'd be very likely. In fact, 84% of the people said they'd be unlikely or not at all likely.
Read more here.

10 of the Most Fuel-Efficient Cars in the United States. Read more here.

Top 10 Tips To Get Your Startup Noticed. Simple, easy and cheap ways of marketing your startup. Read more here.

Performance of multi-core machines would level off or even decline beyond 8 cores, due to limited memory bandwidth

Read more here.

Why Auto-Scaling In the Cloud could be a Bad Idea? Read more here.

Will VCs become irrelevant - totally Awesome Post by Paul Graham
VC funding will probably dry up somewhat during the present recession, like it usually does in bad times. But this time the result may be different. This time the number of new startups may not decrease. And that could be dangerous for VCs.
Read more here.

"No evidence from last 10 years that users want Indian languages" says Ajit Balakrishnan, CEO Rediff.com
Rediff has email in 11 languages, and 99% of the users prefer to use email in English. One of the issues is that “practically all of the 300 million young people who aspire to something in this country aspire to learn English.” Therefore “Let us not assume that users want Indian languages.”
Read more here.

"BlackBerry Storm, by far the worst product Research in Motion has ever produced". Read more here.

Are rounded corners going away? - "Square is the new round."
Google Reader changed their UI - out with the old rounded corners, drop shadows and heavily saturated colors -- in with a softer palette, faster components and a fresh new look.
Read more here.

Google Apps SLA redefines downtime - “Downtime Period” means a period of ten CONSECUTIVE minutes of Downtime
Read more here.

New Android Phone Debuts, Looks Like a Blackberry. Comes unlocked at $256 US; looks pretty COOL
Read more here.

Firefox malware collects logins & passwords of banking sites, forwarding that information to a server in Russia. Read more here.

Web Scrapers win hundreds of auctions at eBay Holiday Contest at $1. Read more here.

Follow me on Twitter here.

Tuesday, December 09, 2008

Yahoo! Email Ads still can’t get geo-targeting right

Yahoo! Email continues to serve me US ads, wasting money on those high eCPM expandable rich media ads.

What a pity. Yahoo! knows that I live in Pune, India; it can tell the temperature of the city, but cannot geo-target the right ads for me.



There are so many other ad-servers that get it right, why can’t Yahoo! get it right?

Thoughts?

Tuesday, December 02, 2008

Komli Launches ViziSense: India’s First Free and Open Online Audience Measurement Platform

I’m very excited today to announce the launch of Komli ViziSense, the first free and open audience measurement platform for India that accurately reports details of site visitor demographics and other audience characteristics. ViziSense helps publishers with an independent measurement system, enables advertisers to access and understand their online audience with precision, and allows ViziSense agencies to plan better media buys.

Read more details here and here.

Some screen shots below:


Saturday, November 22, 2008

5 tips for startups for handling the power situation

Recently we have had a few power outages in our office in Pune. After about 3 days of facing the crisis, I had to find a solution for this problem. Following are 5 tips for startups for handling the power situation. These tips are specifically for early-stage startups.

  1. USE LAPTOPS INSTEAD OF DESKTOPS – this is pretty much a no-brainer. Laptops consume about 100 watts of power, while desktops consume about 500 watts. Apart from the other obvious benefits, I recently found out that you can pretty much lie down on a bean bag and still work on a laptop. This was after we bought a few bean bags for the office.

  2. USE VIRTUAL MACHINES – this is one of the coolest ideas. Instead of buying 4 low-powered machines, buy one powerful machine and make 4 virtual machines out of it. An example configuration would be 2 x Quad-Core Intel® Xeon® processor E5410 + 16GB RAM + 1TB HDD, split 4 ways into 4 powerful 2-core servers. Pretty awesome: 4 physical machines would consume about 2000W (at 500W each), while the 2 x Xeon machine would consume about 500-600W.

  3. INSTALL A “TRUE ONLINE SINE-WAVE UPS” – these are much more powerful than a regular inverter/UPS. They are costly, yet worth every penny. They have much more battery power than a regular inverter/UPS, and can last much longer. They come in 6KVA, 8KVA and 10KVA, on the smaller side.

  4. SEPARATE OUT THE BACKUP FOR SERVERS, laptops and fans/lights – The problem with this is that it adds more points of failure, because there are more parts that can fail. You don’t want to hear UPSes beeping all day in your office.

  5. MOVE TO A FULLY FURNISHED OFFICE that has generator backup – These cost upwards of Rs. 70 per sqft. Pretty high cost for a startup.

To summarize, tips 1, 2 and 3 are the really feasible options.
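To put rough numbers on tips 1 and 2, here is a small Python sketch using the wattage estimates from this post. The figures are my ballpark assumptions, not measurements, and the names are made up for illustration.

```python
# Rough power comparison for tips 1 and 2, using this post's estimates.
DESKTOP_WATTS = 500          # assumed draw of one desktop or small server
LAPTOP_WATTS = 100           # assumed draw of one laptop
VM_HOST_WATTS = (500, 600)   # estimated range for the 2 x Xeon machine


def savings_percent(before_watts: float, after_watts: float) -> float:
    """Return the percentage of power saved going from before to after."""
    if before_watts <= 0:
        raise ValueError("before_watts must be positive")
    return 100.0 * (before_watts - after_watts) / before_watts


if __name__ == "__main__":
    four_physical = 4 * DESKTOP_WATTS
    low, high = VM_HOST_WATTS
    print("laptop vs desktop:", savings_percent(DESKTOP_WATTS, LAPTOP_WATTS))
    print("one VM host vs 4 machines (best case):",
          savings_percent(four_physical, low))
    print("one VM host vs 4 machines (worst case):",
          savings_percent(four_physical, high))
```

Even at the high end of the estimate, consolidating onto one virtualized box cuts the load on the UPS substantially, which is what makes tip 3 affordable.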

Got more ideas? Have questions? Send me a message on Facebook or Twitter, or send me an email at mukul dot kumar at pubmatic dot com.

Wednesday, November 19, 2008

The Real Cost of Amazon CloudFront

Amazon introduced a web service for content delivery, called Amazon CloudFront, yesterday. CloudFront is expected to start a pricing war among the current CDN providers.

If you do a little bit of calculation for the real cost of the CDN, it turns out that it is much higher than the advertised pricing for smaller files.

Following is the effective cost per GB for USA locations for CloudFront. For example, if your files are 5KB in size, you will actually pay $0.3797 per GB, not $0.17. If your file size is 10MB, then you will pay almost exactly the advertised price of $0.17 per GB. So, essentially, if you are distributing images or movies, CloudFront will be cost effective; however, if you are distributing small JavaScript files, you may be paying a lot more.

File Size    Effective Cost per GB
5KB          $0.3797 (123% more)
10KB         $0.2749
20KB         $0.2224
50KB         $0.1910
100KB        $0.1805
500KB        $0.1721
1MB          $0.1710
5MB          $0.1702
10MB         $0.1701
100MB        $0.1700
1GB          $0.1700

Here is how I calculated this: for US locations, the data transfer rate (for the first 10 TB / month) is $0.17 per GB and the request rate is $0.01 per 10,000 requests. With file_size in KB, delivering 1 GB takes 1024*1024/file_size requests, so:
Effective cost per GB = $0.17 + (1024*1024/file_size) * ($0.01/10000)
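For anyone who wants to plug in their own file sizes, here is a minimal Python sketch of the calculation above. It assumes the same prices quoted in this post ($0.17 per GB and $0.01 per 10,000 requests) and that every request fetches exactly one whole file. The function and constant names are my own, not part of any Amazon API.

```python
# Effective CloudFront cost per GB as a function of file size.
# Prices are the ones quoted in this post (US, first 10 TB / month).

TRANSFER_PRICE_PER_GB = 0.17      # dollars per GB transferred
REQUEST_PRICE = 0.01 / 10_000     # dollars per single request
KB_PER_GB = 1024 * 1024


def effective_cost_per_gb(file_size_kb: float) -> float:
    """Return dollars paid to deliver 1 GB as files of file_size_kb each."""
    if file_size_kb <= 0:
        raise ValueError("file_size_kb must be positive")
    requests_per_gb = KB_PER_GB / file_size_kb
    return TRANSFER_PRICE_PER_GB + requests_per_gb * REQUEST_PRICE


if __name__ == "__main__":
    sizes = [("5KB", 5), ("100KB", 100), ("1MB", 1024), ("10MB", 10 * 1024)]
    for label, size_kb in sizes:
        print(f"{label:>6}  ${effective_cost_per_gb(size_kb):.4f}")
```

The request charge is fixed per file, so it only matters when a GB is made up of a very large number of small files.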

Saturday, November 15, 2008

The world’s most super-designed data center

Royal Pingdom reported: The newly opened high-security data center run by one of Sweden’s largest ISPs, located in an old nuclear bunker deep below the bedrock of Stockholm city, sealed off from the world by entrance doors 40 cm thick (almost 16 inches).

Facts about the data center:
  • Originally a nuclear bunker
  • Located in central Stockholm below 30 meters (almost 100 ft) of bedrock
  • Can withstand a hydrogen bomb
  • Houses the Network Operations Center for one of Sweden’s largest ISPs
  • German submarine engines for backup power
  • 1.5 megawatt of cooling for the servers
  • Triple redundancy Internet backbone access
  • Work environment with simulated daylight and greenhouses
Following is one of the photographs, courtesy of Royal Pingdom:


Read more here.


Monday, November 10, 2008

VMware Wants to Bring Virtualization to Smart Phones - BUT where are the use-cases

eWeek.com reports that VMware is going to release a virtualization platform for mobile phones, but gives no real use-case for such a platform. Later in the article they talk about the use-case of creating different profiles in different VMs. Isn't that possible today using different profiles on a Windows-based operating system? Why would anybody create a separate virtual machine just to create a different profile? I could imagine (hypothetically speaking) running an iPhone inside a RIM phone, or running an S60 app inside an iPhone, but is that a real use case?

Here are parts of the article:

"VMware is looking to bring its virtualization technology to smart phones and cell phones in 2009 through a new virtualization platform called the VMware Mobile Virtualization Platform, or MVP. The platform will use a small-footprint hypervisor that will allow users to have multiple virtual machines on their smart phones in the same way a desktop or notebook can host different virtual environments. Mobile Virtualization Platform, or MVP, which consists of a small, bare-metal hypervisor— 20KB to 30KB —that will work with a number of mobile devices based on an ARM processor.

“This virtualization layer that we have is just like the one on the server and desktops, and it will allow customers to run multiple virtual environments on the phone,” said Krishnamurti. “We think there are some interesting use cases. One is that many people have one phone for work and another is a personal phone. With virtualization, you can have one device that runs both environments in two isolated virtual machines. The work profile and the personal profile are completely separated.”

Right now, the VMware MVP platform will support a number of mobile devices based on Linux, Windows CE and Symbian, which is now owned by Nokia. Later, Krishnamurti said, VMware will add support for Google’s Android operating system."


Friday, November 07, 2008

Cloud Computing: How important is "data locality" from a costing perspective?

Nicholas Carr wrote an excellent article about cloud computing, "The new economics of computing".

"In late 2007, the New York Times faced a challenge. It wanted to make available over the web its entire archive of articles, 11 million in all, dating back to 1851. It had already scanned all the articles, producing a huge, four-terabyte pile of images in TIFF format. But because TIFFs are poorly suited to online distribution, and because a single article often comprised many TIFFs, the Times needed to translate that four-terabyte pile of TIFFs into more web-friendly PDF files.

Working alone, he uploaded the four terabytes of TIFF data into Amazon's Simple Storage Service (S3) utility, and he hacked together some code for EC2 that would, as he later described in a blog post, "pull all the parts that make up an article out of S3, generate a PDF from them and store the PDF back in S3." He then rented 100 virtual computers through EC2 and ran the data through them. In less than 24 hours, he had his 11 million PDFs, all stored neatly in S3 and ready to be served up to visitors to the Times site.

The total cost for the computing job? Gottfrid told me that the entire EC2 bill came to $240. (That's 10 cents per computer-hour times 100 computers times 24 hours; there were no bandwidth charges since all the data transfers took place within Amazon's system - from S3 to EC2 and back.)"

One thing missing from the "NYT TIFF to PDF conversion computational task" story is any mention of the data transfer cost of uploading 4TB of TIFF images into S3.

Doing some simple computations: Amazon would charge about $409.60 for uploading 4TB of data into S3, and an additional $261.12 for downloading the processed PDF files, which were 1.5TB in size. That is about $670.72. In addition, there will be bandwidth charges for this 5.5TB of data transfer at the NYT datacenter (4TB out and 1.5TB in); I am sure those will be of the order of $400-$600. That could take the data transfer costs into the $1000-$1200 range.

In addition to that, consider the amount of time it would take to transfer that much data. At 10Mbps, it would take 53.4 days to transfer this data.
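
For anyone who wants to redo the arithmetic, here is a rough Python sketch. The per-GB rates are not a quoted price list; they are what the totals above imply ($0.10 per GB in, $0.17 per GB out), and the names are mine. The 53.4-day figure comes out when 1 TB is counted as 1024*1024 MB and 1 Mb as 2^20 bits.

```python
GB_PER_TB = 1024
SECONDS_PER_DAY = 86_400

# Assumed rates, inferred from the totals in this post.
UPLOAD_RATE = 0.10      # dollars per GB into S3
DOWNLOAD_RATE = 0.17    # dollars per GB out of S3


def s3_transfer_cost(upload_tb, download_tb):
    """Dollars charged for moving data into and out of S3 at the assumed rates."""
    upload = upload_tb * GB_PER_TB * UPLOAD_RATE
    download = download_tb * GB_PER_TB * DOWNLOAD_RATE
    return upload + download


def transfer_days(total_tb, link_mbps):
    """Days needed to move total_tb over the link, using binary units throughout."""
    megabits = total_tb * 1024 * 1024 * 8
    return megabits / link_mbps / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"S3 transfer cost: ${s3_transfer_cost(4, 1.5):.2f}")
    print(f"Days at 10Mbps:   {transfer_days(5.5, 10):.1f}")
```

With decimal units (10^12 bytes per TB, 10^6 bits per Mb) the transfer time works out closer to 51 days, which does not change the conclusion.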

Using Hadoop on EC2 is definitely a great idea and is very helpful; however, the locality of data also matters a lot. Moving data, in my opinion, costs a lot, and sometimes outweighs the computational costs ($240 here).

Let me know your thoughts.


Sunday, November 02, 2008

Finding and fixing memory issues using Valgrind with example of Apache+FastCGI web application

My colleague, Anand Das, has written an excellent article on using Valgrind to debug live web applications. Read the full article here.


Saturday, October 11, 2008

Is ‘Divide and Conquer Algorithm’ a human instinct?

Last Friday I was at a restaurant with my son. While browsing the menu, I thought I would give him a quick problem and see how far he got in solving it. The menu had 5 pages, and each page had about 20 items.

I asked my son if he could find the ‘cheapest’ and the ‘costliest’ items in the entire menu.

My son is 6 years old, and does some very basic mathematics at school, such as arranging numbers in ascending and descending order, simple single-digit additions and subtractions, etc.

I was amazed to find that my son quickly found the cheapest item on page #1, then the cheapest item on page #2, chose the lesser of the two ( MIN(page#1,page#2) ), then repeated that for pages 3, 4 and 5, and gave me the cheapest item on the menu (Rs. 6). He then did the same for the costliest item: he found the costliest item on page #1, then the costliest item on page #2, chose the higher of the two ( MAX(page#1,page#2) ), carried on through the remaining pages, and gave the answer (Rs. 95). [We were at a Udipi restaurant, so things are not that costly here.]
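
Written down as code, what he did looks roughly like the Python sketch below: solve each page on its own, then combine the per-page answers. The prices are made up for illustration (only Rs. 6 and Rs. 95 come from the actual menu), and the function name is mine.

```python
def cheapest_and_costliest(pages):
    """pages is a list of pages; each page is a list of prices in rupees."""
    # Step 1: solve the small problem for each page independently.
    page_mins = [min(page) for page in pages]
    page_maxes = [max(page) for page in pages]
    # Step 2: combine the per-page answers into the answer for the whole menu.
    return min(page_mins), max(page_maxes)


# Hypothetical menu: 5 pages, a few items each.
menu = [
    [25, 6, 40],
    [30, 12, 55],
    [95, 20, 18],
    [45, 60, 35],
    [15, 80, 22],
]

if __name__ == "__main__":
    print(cheapest_and_costliest(menu))
```

Splitting by page keeps each step small enough to do in your head, which is probably why it feels so natural.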

I never taught him the ‘divide and conquer algorithm’, in fact I have never taught him any algorithm.

Is ‘Divide and Conquer Algorithm’ a human instinct? Does it come without learning?

Thoughts?


Monday, October 06, 2008

5 Hacks for Startup Hiring in India

Here are some thoughts on hiring for a startup in India. My experience with hiring in India for the last fifteen years, in one word, has been “Awesome”! In Pune I have met some of the best programmers and designers in the world, and I work with many of them. They are some of the most hardworking, smart and knowledgeable individuals, who love to crank code (read an interesting post here, under the “People” section). I love working with the great guys at Komli!

Hiring in India is different from hiring in other parts of the world. The following thoughts are written with an employer in mind, especially a startup employer. They are in random order and based on personal experience. Please don’t equate my consistent use of ‘he/him’ with a gender bias.

1. “Offer acceptance is not equal to JOINING”

This is something you learn the hard way. It is very difficult to believe that a candidate talks so nicely, and accepts your job offer, only to NOT show up on the joining date. This is a shocker, which takes several days to recover from. If the candidate is good he calls up/e-mails you a few days in advance, telling you that he cannot join. Many will not inform you, and simply won’t show up on the joining date.

My recent experience: a senior managerial candidate, who was relocating from the USA to India, accepted the offer after several negotiations that went on for weeks. He was very happy, and I was very happy that we had a deal. He was supposed to land in Pune and join a few hours after landing, but on that day he didn’t show up. I patiently waited until the evening, and then the next morning. I emailed him, and found out that instead of flying into Pune, he had landed in Bangalore and joined another large company the previous day. How nice.

Accept the fact: a hire is only a “probability” until the day he shows up. This probability increases as the date of joining comes near. An offer acceptance by e-mail or in hard copy is still only a probability of joining.

The way I would handle this is: a) don’t count on a hire until he joins, b) always plan for backups, since no hiring is complete until the last guy joins, c) keep calling the candidate every few days to find out if he is going to join (if he tells you that he is not joining, it’s better to know that early on rather than on the last date), and d) if the numbers are large, over-hire to compensate for the probability.

2. “The Resume”

I have found that many resumes have inaccurate information in them. You can actually build a “probability-framework” on what percentage of a resume is true – based on some of the key parameters of the resume – such as skill-set (Java, PHP, RoR, ASP.NET, C/C++, C#, etc). Try that, it works.

The way I handle this is to talk to the candidate, find out what he has done, and correlate that with his resume. Most of the time they match.

Another interesting parameter is the “keyword density”. My personal experience has been that the higher the keyword density, the more likely it is that the candidate is bogus. You cannot learn all of “C, C++, Java, PHP, MySQL, Oracle, OLTP, Apache, Tomcat, RoR” in 2 years :-)

3. “The Sourcing”

Sourcing of resumes has a major impact on the success rate. I think it is very important to assess the success rate of each source of resumes: direct, referral, recruiters, newspapers, online portals, etc. You may be surprised to know that there may be a difference of 10x in the conversion rates of these sources, so you should focus on the source that has the highest conversion rate.

For startups referrals work the best. Keep your employees happy, so that they find more friends who want to become happy!

4. “The Interview”

A few things off the top of my head:
  • Do initial screenings before you go too deep into technical discussion. If the candidate is not good, find that out in the first ten minutes of discussion, so that you save time on both sides.
  • Ask questions about the most recent problem that your company is facing, and find out whether he can solve it. Even if a guy can solve the most complex algorithm problems, or do the most optimal data structure design, can he solve your current (or past two or three) problems? Make sure you factor that into the overall decision.
  • Don’t compare the candidate to yourself: “he is not like me; I can do it better than him”. It is very difficult to find a guy better than yourself, don’t try that :-)

5. “The Timing”

Try to keep “good” interviews at the top of the day, during mornings. You are in the office at 9AM; if the candidate doesn’t show up, or doesn’t pick up the phone, that does very bad things to your day. It’s a difficult thing to do, but I try to keep most interviews in the later part of the day.

Well, that’s all I have for now. There are many more things, but I wanted to keep it simple.

Got any more ideas? Send me a message on Facebook or Twitter.


Wednesday, September 17, 2008

Why doesn't Google Chrome render my bullets right?


See the rendering of bullets using FF, IE7 and Chrome below:


EOS 5D Mark II Digital SLR camera - Is this the new future proof SLR Camera?

Is this the new future proof SLR Camera?

The EOS 5D Mark II Digital SLR camera is the long-awaited successor to Canon's highly popular EOS 5D, which was introduced in 2005 [original post here].
  • 21.1-megapixel full frame, 24 x 36mm CMOS sensor, DIGIC 4 imaging processor
  • Expanded sensitivity range from ISO 50 to ISO 25,600
  • 16:9 Full HD video capture at 1920 x 1080 pixels and 30 fps as well as 4:3 standard TV quality (SD) video capture at 640 x 480 pixels and 30 fps



Read more here.


PubMatic interview with Amar Goel

PubMatic interview with Amar Goel, CEO PubMatic/Komli:
Amar talks about ad optimization, how PubMatic benefits publishers, and how the ad-price index shows how ad prices are trending:


Thursday, September 11, 2008

Chrome shine for Google

I wrote a review of Google Chrome Browser for Financial Express. It appeared on page-8 of the newspaper, and is also available online at - http://bit.ly/chromefe .

Check it out.


Wednesday, September 03, 2008

Google Chrome Tested

I have been testing Google Chrome for the last 6 hours. Here is a short review. Overall, I don’t find anything so amazing that it would make me say WOW! The Omnibox is cool, but I feel that it makes only a marginal improvement to my browsing experience. The most important thing I noticed is that Chrome is slow; it’s slower than Firefox. I have to admit that I am so used to Firebug that I look at its Net panel every few minutes, and I find that functionality missing in Chrome. I like Chrome->Developer->Debug Javascript, but it has only a little of what I need. Also, check this: in Chrome, click on the Developer menu and try to switch windows with alt-tab; that doesn’t work. Why has Chrome disabled my Windows alt-tab switch? Web sites seem to work normally; rendering and JavaScript seem to be working fine.

I feel that Javascript execution is very, very fast. I tested a page where FF normally gives a Javascript timeout (stop, continue …); Google Chrome ran it just fine and delivered the Javascript rendering in less time than expected. I will therefore need to do more Javascript testing on Chrome.

Though I am a little concerned about the vulnerability discussed here.

More later.


Tuesday, August 12, 2008

Social network popularity around the world

Very interesting data on the popularity of social networks has been posted on this web site. The social networks included in the survey were MySpace, Facebook, Hi5, Friendster, LinkedIn, Orkut, Last.fm, LiveJournal, Xanga, Bebo, Imeem and Twitter.
  • Facebook is most popular in Turkey and Canada.
  • Friendster and Imeem are most popular in the Philippines.
  • LinkedIn is most popular in India.
  • Twitter is most popular in Japan.
  • LiveJournal is more popular in Russia than it is in the United States.
  • Orkut is more popular in Iran (10th country popularity-wise) than it is in the United States.
  • MySpace is the only social network which is most popular in the United States.
Read more here.

Servers: What do they really cost?

Really interesting post on Forbes.com about how much a server really costs; some excerpts follow:
  • Spending $2,500 on a server really means spending between $8,300 and $15,400 in facility capital to provide the necessary space for housing the server and powering it.
  • A single rack of blade servers will consume $1 million in facility capital. Pretty staggering! And this is the underlying source of the boom in data-center construction driven by the need to house 5 million additional servers.
Read more here.

Sunday, July 27, 2008

Cloud Availability

Cloud computing has become very widespread, with startups as well as divisions of banks, pharmaceutical companies and other large corporations using it for computing and storage.
Amazon Web Services has led the pack with its innovation and execution, with services such as the S3 storage service, the EC2 compute cloud, and the SimpleDB online database.

Many options exist today for cloud services, for hosting, storage and application hosting. Some examples are below:
Hosting             Storage           Applications
Amazon EC2          Amazon S3         opSource
MOSSO               Nirvanix          Google Apps
GoGrid              Microsoft Mesh    Salesforce.com
AppNexus            EMC Mozy
Google AppEngine    MOSSO CloudFS
flexiscale

[A good compilation of cloud computing is here, with a nice list of providers here. Also worth checking out is this post.]

The high availability of these cloud services becomes more important with some of these companies relying on these services for their critical infrastructure. Recent outages of Amazon S3 (here and here) have raised some important questions such as this - S3 Outage Highlights Fragility of Web Services and this.

[A simple search on search.twitter.com can tell you things that you won't find on web pages. Check it out with this search, this and this.]

There has been some discussion on the high availability of cloud services and some possible solutions. For example the following posts - "Strategy: Front S3 with a Caching Proxy" and "Responding to Amazon's S3 outage".

Here are some thoughts on how these cloud services can be made highly available by following the traditional path of redundancy.

Cloud Availability configurations
The traditional way of using AWS S3 is to use it with AWS EC2 (config #0).

Configurations such as those on the left can be used so that your computing and storage do not depend on the same service provider.
Config #1, config #2 and config #3 mix and match some of the more flexible computing services with storage services.
In theory, the compute and the storage can each be replaced separately by a colo service.


The configurations on the right are examples of providing high availability by making a "hot-standby".

Config #4 makes the storage service a hot-standby, and config #5 separates the web-service layer from the application layer
and makes the whole application+storage layer a hot-standby.

A hot-standby requires three things to be configured - rsync, monitoring and switchover.

rsync needs to be configured between hot-standby servers, to make sure that most of the application and data components
are up to date on the online-server. So, for example, in config #4 one has to rsync 'Amazon S3' to 'Nirvanix'; that's pretty
easy to set up. In fact, if we add more automation, we can "turn off" a standby server after making sure that the
data-source is synced up, though that assumes that the server provisioning time is an acceptable downtime,
i.e. that the RTO (Recovery Time Objective) is within acceptable limits.

This also requires that you monitor each of the web services. One might have to do service-heartbeating;
this has to be designed for the application, and designed differently for monitoring Tomcat, MySQL,
Apache or their sub-components. In theory it would be nice if a cloud computing service exported status APIs,
for example an API for http://status.aws.amazon.com/ , http://status.mosso.com/ or http://heartbeat.skype.com/.
However, most of the time the status page is updated well after the service goes down, so that wouldn't help much.

Switchover from the online-server/service to the hot-standby would probably have to be done by hand.
This requires a handshake with the upper layer, so that requests stop going to the old service and start going to the new one
when you trigger the switchover. This might become interesting with stateful services, and also where
you cannot drop any packets, so requests may have to be quiesced before the switchover takes place.
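
To make the monitoring and switchover part a little more concrete, here is a minimal Python sketch of a heartbeat counter. It is an illustration under my own assumptions, not a production design: the class name, the "primary"/"standby" labels and the three-failure threshold are all hypothetical, and the probe is whatever application-specific health check you write (an HTTP request, a test query against MySQL, and so on).

```python
class StandbyMonitor:
    """Counts consecutive failed heartbeats and flags when a switchover is due."""

    def __init__(self, probe, max_failures=3):
        self.probe = probe                # callable returning True when the service is healthy
        self.max_failures = max_failures  # consecutive failures tolerated before switching
        self.failures = 0
        self.active = "primary"

    def check(self):
        """Run one heartbeat and return which side should be taking traffic."""
        if self.active == "standby":
            return self.active            # switching back is left as a manual step
        try:
            healthy = bool(self.probe())
        except Exception:
            healthy = False               # a probe that blows up counts as a failure
        self.failures = 0 if healthy else self.failures + 1
        if self.failures >= self.max_failures:
            self.active = "standby"
        return self.active


if __name__ == "__main__":
    # Simulated probe results: one good heartbeat, then the service goes down.
    results = iter([True, False, False, False])
    monitor = StandbyMonitor(lambda: next(results))
    for _ in range(4):
        print(monitor.check())
```

Requiring several consecutive failures avoids switching over on a single dropped heartbeat. Draining or quiescing in-flight requests would still have to happen before traffic actually moves.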


Above are two configurations of multi-tiered web-services, where each service is built on a different cloud service. This is a theoretical configuration, since I know of only a few good cloud services. But it may represent a possible future, where the space becomes fragmented, with many service providers.

Config #7 is config #6 with hot-standby for each of the service layers.
Again this is a theoretical configuration.

Cost Impact
Any of the hot-standby configurations would have a cost impact: adding any extra layer of high availability immediately adds to the cost, at least doubling the cost of the infrastructure. This cost increase can be reduced by making highly available only those parts of your infrastructure that affect your business the most. It depends on how much business impact a downtime causes, and therefore how much money can be spent on the infrastructure.

One of the ways to make the configurations more cost effective is to make them active-active, also called a load-balanced configuration. These configurations make use of all the allocated resources and send traffic to both servers. This configuration is much more difficult to design. For example, if you put the hot-standby storage in an active-active configuration, then every write (DB insert) must go to both storage servers, and a write must not be reported as complete until it has completed on all the replicas (also called mirrored write consistency).
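
A toy Python sketch of the mirrored-write idea, assuming the rule is "a write counts as done only when every replica has it". Plain dictionaries stand in for the two storage services, and the class name and the rollback behaviour are my own illustration rather than any provider's API.

```python
class MirroredStore:
    """Applies every write to all replicas before reporting it complete."""

    def __init__(self, replicas):
        self.replicas = replicas          # dict-like stand-ins for the storage services

    def write(self, key, value):
        touched = []                      # (replica, key existed before?, old value)
        try:
            for replica in self.replicas:
                touched.append((replica, key in replica, replica.get(key)))
                replica[key] = value
        except Exception:
            # One replica failed: undo the write everywhere so the copies stay identical.
            for replica, existed, old_value in touched:
                if existed:
                    replica[key] = old_value
                else:
                    replica.pop(key, None)
            raise
        return True                       # reached only when every replica has the value

    def read(self, key, replica_index=0):
        """Reads can be served by any replica, since all of them hold the same data."""
        return self.replicas[replica_index][key]


if __name__ == "__main__":
    primary, mirror = {}, {}
    store = MirroredStore([primary, mirror])
    store.write("ad:42", "banner.png")
    print(store.read("ad:42", replica_index=1))
```

Real storage services add the hard parts that this sketch skips: partial failures during the rollback itself, concurrent writers, and the latency of waiting for the slowest replica.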

Cloud Computing becoming mainstream
As cloud computing becomes more mainstream, larger web companies may start using these services and may put a part of their infrastructure on a compute cloud. For example, I can imagine a cloud dedicated to "data mining" being used by several companies; it may have servers with large HDDs and memory, and may specialize in cluster software such as Hadoop.

Lastly, I would like to cover my favorite topic - why would I still use services that cost more for my core services instead of using cloud computing?
  1. The most important reason would be 24x7 support. Hosting providers such as servepath and rackspace provide support. When I call support at 2PM India time, they have a support guy picking up my call; that’s a great thing. Believe me, 24x7 support is a very difficult thing to do.
  2. These hosting providers give me more configurability for RAM/disk/CPU
  3. I can have more control over the network and storage topology of my infrastructure
  4. Point #2 above can give me consistent throughput and latency for I/O access, and network access
  5. These services give me better SLAs
  6. Security


Wednesday, July 23, 2008

Comparing Clouds: Amazon EC2, Google, AppNexus, and GoGrid

InfoWorld has an awesome article published by Peter Wayner, who compares various cloud computing services - Amazon, Google, AppNexus, and GoGrid. Read more here.

Following is an excerpt:


Tuesday, July 22, 2008

Scientists found a workable way of reducing CO2 levels

Physorg.com reports - Scientists say they have found a workable way of reducing CO2 levels in the atmosphere by adding lime to seawater. And they think it has the potential to dramatically reverse CO2 accumulation in the atmosphere, reports Cath O'Driscoll in SCI's Chemistry & Industry magazine published today.

Adding lime to seawater increases alkalinity, boosting seawater's ability to absorb CO2 from air and reducing the tendency to release it back again.

The important point is that when you put lime into seawater it absorbs almost twice as much carbon dioxide as is produced by the breaking down of the limestone in the first place.

The Cquestrate project has a web page; here's how they describe the idea; and here's a chance of getting involved. I think this is HUGE!

They also have this awesome slideshow presentation, which goes into the gory but interesting details of all the chemical reactions.

Here is their idea of how they would do it (subject to economic feasibility):
  • The Nullarbor Plain is the world's largest single piece of limestone and occupies an area of about 200,000 km2
  • With an average thickness of 50m, there is 10,000 km3 of limestone
  • To sequester 1 billion tonnes of carbon (GtC) would require the excavation of 1.5 km3 of limestone
  • Between 1750 and 2004 humankind has emitted 305 GtC. Current emission rates are about 7 GtC per year
  • Thus employing this process on 500 km3 of limestone (about 5% of the limestone in the Nullarbor Plain) would return carbon dioxide levels to pre-industrial levels.
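
The numbers in the list above hang together; here is a quick Python check using only the figures quoted there (the variable and function names are mine):

```python
AREA_KM2 = 200_000        # area of the Nullarbor Plain
THICKNESS_M = 50          # average thickness of the limestone
KM3_PER_GTC = 1.5         # limestone to excavate per GtC sequestered
EMITTED_GTC = 305         # emissions between 1750 and 2004


def limestone_budget():
    """Return (limestone available, limestone needed, fraction needed), volumes in km3."""
    available_km3 = AREA_KM2 * THICKNESS_M / 1000
    needed_km3 = EMITTED_GTC * KM3_PER_GTC
    return available_km3, needed_km3, needed_km3 / available_km3


if __name__ == "__main__":
    available, needed, fraction = limestone_budget()
    print(f"available: {available:,.0f} km3")
    print(f"needed:    {needed:,.1f} km3 ({fraction:.1%} of the plain)")
```

So roughly 460 km3 is needed, which they round up to 500 km3, or about 5% of the limestone in the plain.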

I hope all of this comes true. I think the economic viability of converting CaCO3 into CaO will be the biggest hurdle.
