On the 45th Anniversary of the Moon Landing: 5 Lessons Apollo’s Program Manager Taught Me at MIT

I originally posted a version of this five years ago, on the 40th Anniversary of the Apollo Moon landing. At that time, social media and smartphones were just starting to explode. Today, as social sharing and mobile are giving rise to IoT, these lessons from 1969 are perhaps even more important.

Putting things in perspective

It is easy to feel really proud of our accomplishments, whether we are scaling a consumer application 1,000-fold in one year, rolling out a huge ERP program or even creating a new technology. However, these accomplishments pale in comparison to what the Apollo, Gemini, and Mercury Missions achieved 45 years ago. Imagine this scenario:

You are listening to the radio and the President announces that the country is going to put a man on the Moon by the end of the decade. Keep in mind that no one has ever even escaped low Earth orbit–let alone escaped Earth’s gravity, executed Hohmann transfers AND navigated to another body. Now you have to implement the largest engineering project in history, while inventing not only technologies, but also whole fields of study. All under the watch of the press—and all completed within one decade.

This is inconceivable to most of us in our work today. It is inspirational.

Success: One small step for man, one giant leap for mankind. (Credit: NASA)

My lucky exposure to the people of Apollo

At the time I studied aerospace engineering at MIT, we were lucky enough to have several veterans of the Apollo Program on staff as our instructors. Not only were they great instructors; they also could recount first-hand experiences of events that the rest of us could only read about in the history books.

One of these professors was Joe Shea, the original Program Manager of NASA’s Apollo Program (portrayed by Kevin Pollak on HBO’s excellent series, “From the Earth to the Moon”). Contrary to what that series depicted, it was Joe who came up with the concept of splitting the Apollo Program into missions that each achieved a never-before-accomplished technological marvel.

Joe is also considered by some to be a founder of the Systems Engineering profession (many consider him the greatest systems engineer who ever lived). This made him the perfect person to teach the capstone class of the aerospace curriculum: Systems Engineering (Fred Wilson of USV has written a great post on how fun Systems Engineering is and how important it is for engineering leadership). Every year, he would get a project from NASA and guide his students through all aspects of design, simulation, planning and even cost analysis. Our midterms and finals were real-life presentations to the Administrator of NASA.

Under Joe, I got to work on something called “Project Phoenix,” returning to the Moon—but now with a re-usable capsule, landing four astronauts at the pole and keeping them there for 30 days (a much harder prospect). In this project I learned about everything from active risk management to critical path costing to lifting bodies to Class-E solar flares. (How cool was that for a 20-year-old?)

Life lessons I learned from Joe

The technical things I learned from Joe got me my first job at Lockheed Martin (then GE Aerospace). It was great to be able to say that I had worked on a NASA program, helped create both a PDR (Preliminary Design Review) and a CDR (Critical Design Review), and presented elements of them to the Administrator of NASA in Washington.

However, I learned five much more important lessons – independent of aerospace or any other technology – that I have used in the twenty-three years since:

  1. Break Big Challenges into Small Parts. Any obstacle can be overcome if you break it down into smaller items. If these are too large, break them down again. Eventually you will get to things that have clear, straightforward paths to success. Essentially this is the engineer’s version of “a journey of a thousand miles begins with a single step.”
  2. Know Your Stuff Inside and Out. You cannot be a technology leader who only manages from above. You must understand how the components work. This is the only way you will see problems before they happen. Remember, as the leader you are the only one positioned to connect the “Big Picture” to the execution details.
  3. S#!% Happens. Things break. Schedules are late. People leave the project. Plan for this. Ask yourself every week what can go wrong. Put contingency plans together to address the biggest or most likely of these. Today, this is done in everything from Risk Management to DevOps.
  4. There is No Such Thing as Partial Credit. Yes, unlike a rocket, you can “back out” (essentially un-launch) software. However, the costs of this type of failure are enormous: not only does it cost 3-5x more to back out, fix and regression test changes, it also frequently results in lost revenue and customers. Get things right in development – then certify them in testing (not the other way around). Don’t count on being able to “back out” after a failed launch–this will become more and more true as we push software to the millions of “things” comprising IoT. Joe hammered this lesson into our heads with a chilling story: when people forgot it and rushed, three astronauts died during a basic systems test of Apollo 1.
  5. Take Ownership. If you are the leader, you are responsible for the team’s or product’s success. If you are a line manager, you are not only responsible for your area but are being relied upon by your peers for success. If you are a hands-on analyst or engineer you are actually delivering the work that leads to success. In all cases, ensure you do your job right, ask for help when you need it and never lie or hide anything.

Five really important lessons. I am grateful I had the opportunity to learn them before I entered the full-time workforce. I try to “pay this back” by teaching these lessons and concepts everywhere I go.

Before I forget…

Thank you to the men and women of Apollo. Thank you also to the men and women of Gemini and Mercury (it is easy to forget them on this day). You achieved miracles on a daily basis and inspired whole generations of scientists and engineers.


Twitter traffic jams in Washington, created by… John Oliver

Note: This post was first published as Twitter Sensors: Detecting the Traffic Jam in Washington Caused by… John Oliver on the Savi Technology Blog.

Summary: In the first week of June, 20% of the Tweets about traffic, delays and congestion by people around the Washington Beltway were caused by John Oliver’s “Last Week Tonight” segment about Net Neutrality.

At work, we are always exploring a wide range of sensors to obtain useful insights that can be used to make work and routine activities faster, more efficient and less risky. One of our Alpha Tests examines the use of “arrays” of highly targeted Twitter sensors to detect early indications of traffic congestion, accidents and other sources of delays. Specifically, we are determining whether Twitter is a good traffic sensor (by “good,” in data-science speak, we mean whether we can train a model for traffic detection that has a good balance of precision and recall, and hence a good F1 score). To do this, I set up a test bed around the nation’s second-worst commuter corridor: the Washington DC Beltway (my own backyard).
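For readers who want the data-science speak made concrete, here is a minimal sketch (in Python, not our production pipeline) of how a candidate traffic-detection model gets graded; the labels and predictions below are made-up examples:

```python
# Minimal sketch: scoring a candidate "Twitter-as-traffic-sensor" model.
# Assumes hand-labeled tweets: 1 = real commuter-traffic tweet, 0 = not.

def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

# Example: true labels for 8 tweets vs. what the model flagged as "traffic"
print(f1_score([1, 0, 1, 1, 0, 0, 1, 0],
               [1, 0, 0, 1, 1, 0, 1, 0]))  # 0.75
```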

Earlier this month, our array of geographic Twitter sensors picked up an interesting surge in highly localized tweets about traffic-related congestion and delays. This was not a typical “bad commute day” surge. The number of topic- and geographically-related tweets seen on June 4th was more than double the expected number for a Tuesday in June around the Beltway; the number seen during lunchtime was almost 5x normal.

So what was the cause? Before answering, it is worth taking a step back.

The folks at Twitter have done a wonderful job of not only allowing you to fetch tweets based on topics, hashtags and geographies, but also adding some great machine-learning-driven processing to screen out likely spammers and suspect accounts. Nevertheless, Twitter data, like all sensor data, is messy. It is common to see tweets with words spelled wrong, words used out of context, or simply nonsensical tweets. In addition, people frequently repeat the same tweets throughout the day (a tactic to raise social media exposure) and do lots of other things that you must train the machine to account for.
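As a small illustration of the kind of cleanup involved (a sketch only; the field names and rules are invented for this example), here is how repeated, near-identical tweets from the same account might be collapsed before they ever reach a model:

```python
import re

def normalize(text):
    """Lowercase, strip URLs and punctuation, and collapse whitespace so that
    near-identical repeats of a tweet reduce to the same key."""
    text = re.sub(r"https?://\S+", "", text.lower())
    text = re.sub(r"[^a-z0-9#@ ]+", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def dedupe(tweets):
    """Keep only the first occurrence of each (user, normalized text) pair --
    people often re-post the same tweet all day to raise exposure."""
    seen, kept = set(), []
    for tw in tweets:  # tw: dict with 'user' and 'text' keys (assumed shape)
        key = (tw["user"], normalize(tw["text"]))
        if key not in seen:
            seen.add(key)
            kept.append(tw)
    return kept

sample = [
    {"user": "a", "text": "Huge backup on the Beltway near Tysons http://t.co/x1"},
    {"user": "a", "text": "Huge backup on the Beltway near Tysons http://t.co/x2"},
    {"user": "b", "text": "Accident at the Wilson Bridge, avoid I-495"},
]
print(len(dedupe(sample)))  # 2 -- the repeated tweet is dropped
```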

That’s why we use a Lambda Architecture to process our streaming sensor data (I’ll write about why everyone–from marketers to DevOps staff–should be excited about Lambda Architectures in a future post). As such, not only do we use Complex Event Processing (via Apache Storm) to detect patterns as they happen; we also keep a permanent copy of all raw data that we can explore to discover new patterns and improve our machine learning models.
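To make the idea concrete, here is a toy sketch of the Lambda Architecture pattern (illustrative only; it is not our Storm topology): a speed layer scores each event as it arrives, while a batch layer can re-process the full raw history whenever the model improves:

```python
RAW_LOG = []  # batch layer's master dataset: append-only store of every raw event

def speed_layer(event, model):
    """Score each event the moment it arrives (the role Complex Event Processing plays)."""
    RAW_LOG.append(event)          # always retain the raw data, untouched
    return model(event)            # real-time view, possibly imperfect

def batch_layer(new_model):
    """Re-run an improved model over ALL history to rebuild the batch view."""
    return [new_model(e) for e in RAW_LOG]

# Usage sketch: a naive model at ingest time, a better one learned later
naive  = lambda e: "traffic" in e
better = lambda e: "traffic" in e and "fcc" not in e   # e.g., learned after the Oliver surge
for e in ["traffic jam on i-495", "fcc site down due to heavy traffic"]:
    speed_layer(e, naive)
print(batch_layer(better))   # [True, False] -- history re-scored with the improved model
```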

That is exactly what we did as soon as we detected the surge. Here is what we found: the cause of the traffic- and congestion-related Twitter surge around the Beltway was… John Oliver:

  1. In the back half of June 1st’s episode of “Last Week Tonight” (HBO, 11pm ET), John Oliver had an interesting 13-minute segment on Net Neutrality. In this segment he encouraged people to visit the FCC website and comment on this topic.
  2. Seventeen hours later, the FCC tweeted that “[they were] experiencing technical difficulties with [their] comment system due to heavy traffic.” They tweeted a similar message 74 minutes later.
  3. This triggered a wave of re-tweets and comments about the outage in many places. Interestingly, this wave was delayed around the Beltway. It surged the next day, just before lunchtime in DC, and continued throughout the afternoon. The two spikes were at lunchtime and just after work. Evidently, people are not re-tweeting while working. The timing of the spikes also reveals some interesting behavior patterns in Twitter use in DC.
  4. By 4am on Wednesday the surge was over. People around the Beltway were back to their normal tweeting about traffic, construction, delays, lights, outages and other items confounding their commute.

Of course, as soon as we saw the new pattern, we adjusted our model to account for it. However, we thought it would be interesting to show in a simple graph how much “traffic on traffic, delays and congestion” Mr. Oliver induced in the geography around the Beltway for a 36-hour period. Over the first week of June, one out of every five Tweets about traffic, delays and congestion by people around the Beltway was not about commuter traffic, but instead about FCC website traffic caused by John Oliver:

Tweets from people geographically Tweeting around the Washington Beltway on traffic, congestion, delays and related frustration for first week of June. (Click to enlarge.)

Obviously, a simple count of tweets is a gross measure. To really use Twitter as a sensor, one needs to factor in many other variables: the use of text vs. hashtags, tweets vs. mentions and re-tweets, the software client used to send the tweet (e.g., HootSuite is less likely to be a good source of accurate commuter traffic data), the number of followers the tweeter has (not a simple linear weighting) and much more. However, the simple count is a useful first-order visualization. It also makes for interesting “water-cooler conversation.”
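As a rough illustration of what such weighting might look like (the factors and weights below are invented for this sketch, not our model), each tweet could contribute a score rather than a flat count of one:

```python
def tweet_weight(tweet):
    """Toy score for how much a single tweet should count as a traffic 'sensor reading'.
    Field names and weights are illustrative only."""
    w = 1.0
    if tweet.get("is_retweet"):
        w *= 0.5                                # retweets echo an observation, they don't make one
    if tweet.get("client") in {"HootSuite", "Buffer"}:
        w *= 0.3                                # scheduled/marketing clients are weak traffic signals
    if tweet.get("has_geo"):
        w *= 1.5                                # a precise geotag is a stronger observation
    followers = tweet.get("followers", 0)
    w *= min(1.0 + followers / 10_000, 2.0)     # deliberately sub-linear and capped
    return w

print(round(tweet_weight({"is_retweet": False, "client": "Twitter for iPhone",
                          "has_geo": True, "followers": 800}), 2))  # 1.62
```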

The Expanding (Digital) Universe: Visualizing How BIG a Zettabyte Really Is

Note: This post was originally published at Oulixeus Consulting

A lot of news articles recently (Google News currently shows 1,060 articles) are citing the annual EMC-IDC Digital Universe studies of the massive growth of the digital universe through 2020. If you have not read the study, it indicates that the digital universe is now doubling every two years and will grow 44-fold, then 50-fold, and now 55-fold, from 0.8 Zettabytes (ZB) of data in 2009 to 35, then 40, and now 44 Zettabytes in 2020. (Every year IDC has revised the growth curve upward by several Zettabytes.)
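As a sanity check of those headline numbers (a back-of-the-envelope calculation using the 2009 and 2020 figures above), growing from 0.8 ZB to 44 ZB over eleven years is a 55-fold increase, which is indeed consistent with doubling roughly every two years:

```python
import math

start_zb, end_zb, years = 0.8, 44.0, 11        # 2009 -> 2020 figures from the study
growth = end_zb / start_zb                     # 55-fold
doubling_period = years / math.log2(growth)    # years per doubling
print(round(growth, 1), round(doubling_period, 1))   # 55.0 1.9
```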

Usually these articles show a diagram such as this:

(Diagram: the 44-fold growth of the digital universe from 2009 to 2020.)

This type of diagram is great at showing how much 44-fold growth is. However, it does not convey how big a Zettabyte really is—and how much data we will be swimming in (or drowning in) by 2020.

A Zettabyte (ZB) is really, really big – in terms of today’s information systems. It is not a capacity that people encounter every day. It’s not even in Microsoft Office’s spell-checker; Word “recommended” that I meant to type “Petabyte” instead ;)

The Raw Definition: How big is a Zettabyte?

A Computer Scientist will tell you that 1 Zettabyte is 2^70 bytes. That does not sound very big to a person who does not usually think in exponential or scientific notation—especially given that a one-Terabyte (1 TB) solid state drive has the capacity to store 2^40 bytes.

Wikipedia describes a ZB (in decimal math) as one sextillion bytes. While this sounds large, it is hard to visualize. It is easier to visualize 1 ZB (and 44 ZB) in relation to things we use every day.
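For the record, here are the two definitions in play, using the binary convention the rest of this post follows (a quick check in Python):

```python
zettabyte_binary  = 2 ** 70    # 1,180,591,620,717,411,303,424 bytes
zettabyte_decimal = 10 ** 21   # 1,000,000,000,000,000,000,000 bytes ("one sextillion")
terabyte_ssd      = 2 ** 40
print(zettabyte_binary // terabyte_ssd)   # 1073741824 -- over a billion 1 TB SSDs per ZB
```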

Visualizing Zettabytes in Units of Smartphones

The most popular new smartphones today have 32 Gigabytes (GB) or 32 x 2^30 bytes of capacity. To get 1 ZB you would have to fill 34,359,738,368 (34.4 billion) smartphones to capacity. If you put 34.4 billion Samsung S5’s end-to-end (length-wise) you would circle the Earth 121.8 times:

Click to see a higher resolution image and the dot that represents Earth to-scale vs. the line

With the same line of smartphones you could actually circumnavigate Jupiter almost 11 times—but that is even harder to visualize.
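Here is the arithmetic behind those figures (a quick check; the Samsung S5 length of roughly 142 mm and the circumferences of about 40,075 km for Earth and 439,264 km for Jupiter are my assumptions, not from the study):

```python
ZB = 2 ** 70
phone_bytes = 32 * 2 ** 30                  # a 32 GB smartphone
phones = ZB // phone_bytes                  # 2**35 phones to hold 1 ZB

phone_length_km = 0.142 / 1000              # Samsung S5 is ~142 mm long (assumption)
line_km = phones * phone_length_km          # ~4.88 million km of phones laid end-to-end
print(phones)                               # 34359738368
print(round(line_km / 40_075))              # ~122 laps around the Earth
print(round(line_km / 439_264))             # ~11 laps around Jupiter
```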

The number of bytes in 44 Zettabytes is a number too large for Microsoft Excel to compute correctly. (The number you will get is so large that Excel will cut off seven digits of accuracy–read that as a potential rounding error of up to one million bytes.) Assuming that Moore’s Law will allow us to double the capacity of smartphones three times between now and 2020, it would take 188,978,561,024 (roughly 189 billion) smartphones to store 44 ZB of data. Placing these end-to-end would circumnavigate the world nearly 670 times.
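Python’s arbitrary-precision integers make these figures easy to verify exactly (a quick check of the numbers above; the “about 15 significant digits” is the precision a double-precision spreadsheet cell works with):

```python
bytes_44_zb = 44 * 2 ** 70        # 51,946,031,311,566,097,350,656 bytes -- computed exactly
print(len(str(bytes_44_zb)))      # 23 digits, well past the ~15 a spreadsheet keeps

phone_2020 = 256 * 2 ** 30        # 32 GB doubled three times by 2020 = 256 GB per phone
print(bytes_44_zb // phone_2020)  # 188978561024 -- about 189 billion phones
```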

This is too hard to visualize, so let’s look at it another way. You could tile the entire City of New York two times over (and the Bronx and Manhattan three times over) with smartphones filled to capacity with data to store 44 ZB. That’s a big Data Center!

Amount of Smartphones (with 2020 tech) you would need to store 44 ZB (click for higher resolution)

This number also represents 25 smartphones per person for the entire population of the planet. Imagine the challenge of managing data spread out across that many smartphones.

Visualizing Zettabytes in Units of Facebook

Facebook currently stores 300 Petabytes (PB), or 300 x 2^50 bytes, of data in their data warehouse (and this is growing at 600 TB per day). This is rather enormous—especially for a private enterprise. However, it is much, much smaller than even one Zettabyte: 1 ZB could contain 3,495 entire Facebook Data Warehouses:

1 ZB would contain 3,495 entire Facebook Data Warehouses (click for a zoomable version)

Facebook leadership has expressed the desire to get everyone on Earth connected. Today, Facebook has 1.3 Billion Monthly Active Users. If they got all 7.5 billion people projected to be alive in 2020—and data continued to grow 10-fold between now and then—their data warehouse would still only be 1/60th of 1 ZB:

7.5 billion MAUs AND a 10x increase in data use still only makes the DW 1/60th of a ZB

Expanding this to 44 ZB increases the number to 153,791 Facebook Data Warehouses. A grid display of this number is beyond the resolution of even Retina Display monitors. You could visualize it with a “Bat Cave-like” grid of four rows of 11 hi-resolution monitors. However, that is impractical and expensive (~$40,000).
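The arithmetic behind the Facebook comparisons (a quick check using the 300 PB figure above; the 2020 population and 10x growth numbers are the hypotheticals from the text):

```python
ZB = 2 ** 70
fb_warehouse = 300 * 2 ** 50            # Facebook's ~300 PB data warehouse today
print(ZB // fb_warehouse)               # 3495 warehouses fit in one ZB
print(44 * ZB // fb_warehouse)          # 153791 warehouses for 44 ZB

# Hypothetical 2020 warehouse: 7.5B users (vs. 1.3B MAUs today) and 10x more data per user
fb_2020 = fb_warehouse * (7.5 / 1.3) * 10
print(round(ZB / fb_2020, 1))           # ~60.6 -- still roughly "1/60th of a ZB"
```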

Really Big No Matter How You Look At It

One Zettabyte is indeed very, very big. Growing from just under 1 ZB to over 44 ZB in a little over a decade is astounding. Managing this—whether you are a technologist responsible for managing data, a business user who needs to use data, or a consumer just trying to manage the flood of all of your personal data—will be a challenge for all.

5 points where tech balances between life and business