Monday, December 30, 2013

A Cautionary Bedtime Story

Once upon a time, when all the world and you and I were young and beautiful, there lived in the ancient town of Metrika a young boy by the name of Joe.

Now, young Joe was a talented lad, and his home town was prosperous and filled with happy folk - Metricians, they were called. Joe was a member of the Econo family, and his ancestors had been among the founding fathers of the town. Originating in the neighbouring city of Econoville, Joe Econometrician's forebears had arrived in Metrika not long after the original settlers of that town - the Biols (from nearby Biologica), and the unfortunately named Psychos (from the hamlet of Psychovia).

In more recent times, other families (or "specialists", as they were sometimes known) had also established themselves in the town, and by the time that Joe was born there was already a sprinkling of Clios (from the ancient city of Historia), and even a few Environs. Hailing from the suburbs of Environmentalia, the Environs were regarded with some disdain by many of the more established families of Metrika.

Metrika began as a small village - little more than a coach-stop and a mandatory tavern at a junction in the highway running from the ancient data mines in the South, to the great city of Enlightenment, far to the North. In Metrika, the transporters of data of all types would pause overnight on their long journey; seek refreshment at the tavern; and swap tales of their experiences on the road.

To be fair, the data transporters were more than just humble freight carriers. The raw material that they took from the data mines was largely unprocessed. The vast mountains of raw numbers usually contained valuable gems and nuggets of truth, but typically these were buried from sight. The data transporters used the insights that they gained from their raucous, beer-fired discussions and arguments (known locally as "seminars") with the Metrika locals at the tavern to help them to sift through the data and extract the valuable jewels. With their loads considerably lightened, these "data-miners" then continued on their journey to the City of Enlightenment in a much improved frame of mind, hangovers notwithstanding!

Over time, the town of Metrika prospered and grew as the talents of its citizens were increasingly recognized and valued by those in the surrounding districts, and by the data transporters.

Young Joe grew up happily, supported by his family of econometricians, and he soon developed the skills that were expected of his societal class. He honed his computing skills; developed a good nose for "dodgy" data; and studiously broadened and deepened his understanding of the various tools wielded by the artisans in the neighbouring town of Statsbourg.

In short, he was a model child!

But - he was torn! By the time that he reached the tender age of thirteen, he felt the need to make an important, life-determining, decision.

Should he align his talents with the burly crew who frequented the gym near his home - the macroeconometricians - or should he throw in his lot with the physically challenged bunch of empirical economists known locally as the microeconometricians?

What a tough decision! How to decide?

He discussed his dilemma with his parents, aunts, and uncles. Still, the choice was unclear to him.

Then, one fateful day, while sitting by the side of the highway and watching the data-miners pass by with their increasingly heavy loads, the answer came to him! There was a simple solution - he would form his own break-away movement that was free of the shackles of his Econo heritage.

Overwhelmed with excitement, Joe raced back to the tavern to announce to the locals that henceforth he was to be known as a Data Scientist.

As usual, the locals largely ignored what he was saying, and instead took turns at talking loudly about things that they thought would make them seem important to their peers. Finally, though, after many interruptions, and the consumption of copious quantities of ale, Joe was able to hold their attention.

"You see", he said, "the data that are now being mined, and transported to the City of Enlightenment, are available in such vast quantities that the truth must lie within them."

"All of this energy that we've been expending on building economic models, and then using the data to test their validity - it's a waste of time! The data are now so vast that the models are superfluous."

(To be perfectly truthful, he probably used words of one syllable, but I think you get the idea.)

"We don't need to use all of those silly simplifying assumptions that form the basis of the analysis being undertaken by the microeconometricians and macroeconometricians."

(Actually, he slurred these last three words due to a mixture of youthful enthusiasm and a mouthful of ale.)

"Their models are just a silly game, designed to create the impression that they're actually adding some knowledge to the information in the data. No, all that we need to do is to gather together lots and lots of our tools, and use them to drill deep into the data to reveal the true patterns that govern our lives."

"The answer was there all of the time. While we referred to those Southerners in disparaging terms, calling them "data miners" as if such activity were beneath the dignity of serious modellers such as ourselves, in reality data-mining is our future. How foolish we were!"

Now, it must be said that there were a few older econometricians who were somewhat unimpressed by Joe's revelation. Indeed, some of them had an uneasy feeling that they'd heard this sort of talk before. Amid much head-scratching, beard-stroking, and ale-quaffing, some who were present that day swear they heard mention of long-lost names such as Koopmans and Vining. Of course, we'll never know for sure.

However, young Joe was determined that he had found his destiny. A Data Scientist he would be, and he was convinced that others would follow his lead. Gathering together as many calculating tools as he could lay his hands on, Joe hitched a ride North, to the great City of Enlightenment. The protestations of his family and friends were to no avail. After all, as he kept insisting, we all know that "E" comes after "D".

And so, Joe was last seen sitting in a large wagon of data, trundling North while happily picking through some particularly interesting looking nuggets, and smiling the smile of one who knows the truth.

To this day, econometricians gather, after a hard day of modelling, in the taverns of Metrika. There, they swap tales of new theories, interesting computer algorithms, and even the characteristics of their data. Occasionally, Joe's departure from the town is recalled, but what became of him, or his followers, we really don't know. Perhaps he never actually found the City of Enlightenment after all. (Shock, horror!)

And that, dear children, is what can happen to you - yes, even you - if you don't eat all of your vegetables, or if you believe everything that you hear at the tavern.



© 2013, David E. Giles

3 comments:

  1. Well . . . yes. Data mining might not apply to economics (macro or micro) or to much of any other science, for that matter. Testing theories involves stating assumptions, explicitly modeling or otherwise predicting outcomes, then testing the outcome.

    Predictive analytics doesn't do this. It is strictly a black box methodology, with some information theory and a bit of statistical theory suggesting ways to tune the black box at each iteration of refining it (for what passes as a model in predictive analytics is increasingly opaque). Assessing whether the black box is useful requires some measure or set of measures (accuracy, recall, precision) which are maximized against holdout samples.

    I suspect that machine learning methods will be increasingly useful for prediction--things like whether a set of pixels forms a letter, whether a sound pattern forms a syllable, whether a set of transactions involves fraud. And to the extent that it grounds prediction in actual behavior, it might be useful for people doing economic work. It might suggest a "ground" on which to form useful microfoundations, or a filter for removing unlikely macroeconomic variables. That alone suggests economists should get their hands dirty with the new technologies.

    Disclaimer: I have a modest background in statistics and machine learning, some experience with psychometrics (and the unfortunately named psychos) and time series models. I'm interested in both mathematical techniques and the tricky business of deduction in the social sciences, which is why I follow this blog as well as Andrew Gelman and Judea Pearl. My misinterpretations are all my own.

  2. Seems to me the much more effective machines for refining data are a pretty important development in this story. I just wish the Econos would pay more attention to the Mathmo Clan before spouting off to foreigners like the Icos from the Empire of Polit, or the great port of Journ and its renowned Alism clan.
